L4T Multimedia API Reference

28.2 Release

multimedia_api/ll_samples/docs/l4t_mm_video_dec_tensorrt.md
Copyright (c) 2016-2017, NVIDIA CORPORATION. All rights reserved.

@page l4t_mm_vid_decode_trt 04_video_dec_trt
@{

 - [Overview](#overview)
 - [Building and Running](#build_and_run)
 - [Flow](#flow)
 - [Key Structure and Classes](#key)
 - [Key Thread](#key_thread)
 - [Programming Notes](#notes)
 - [Command Line Options](#options)

- - - - - - - - - - - - - - -
<a name="overview">
## Overview ##

This sample demonstrates the simplest way to use NVIDIA<sup>&reg;</sup>
TensorRT<sup>&trade;</sup> to decode video and save the bounding box information to
the `result.txt` file. TensorRT was previously known as GPU Inference Engine
(GIE).

This sample does not require a camera or display.

<a name="build_and_run">
- - - - - - - - - - - - - - -
## Building and Running ##

#### Prerequisites ####
* You have followed Steps 1-3 in @ref mmapi_build.
* You have installed the following:
  - CUDA
  - TensorRT (previously known as GPU Inference Engine (GIE))
  - OpenCV


### To build:
* Enter:

      $ cd $HOME/tegra_multimedia_api/samples/04_video_dec_trt
      $ make

### To run
* Enter:

      $ ./video_dec_trt <in-file> <in-format> [options]

### Example

    $ ./video_dec_trt ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 \
      --trt-deployfile ../../data/Model/GoogleNet_one_class/GoogleNet_modified_oneClass_halfHD.prototxt \
      --trt-modelfile ../../data/Model/GoogleNet_one_class/GoogleNet_modified_oneClass_halfHD.caffemodel
### Notes

1. The sample generates `result.txt` by default. The results are normalized
   rectangles with coordinates in the range [0,1].

   For example, the following shows the beginning of a result file:

        frame:0 class num:0 has rect:3
        x,y,w,h:0 0.333333 0.00208333 0.344444
        x,y,w,h:0.492708 0.474074 0.05625 0.0888889
        x,y,w,h:0.558333 0.609259 0.09375 0.181481

        frame:1 class num:0 has rect:3
        x,y,w,h:0 0.333333 0.003125 0.346296
        x,y,w,h:0.491667 0.474074 0.0572917 0.0907407
        x,y,w,h:0.558333 0.612963 0.0979167 0.185185

        frame:2 class num:0 has rect:3
        x,y,w,h:0 0.331481 0.003125 0.344444
        x,y,w,h:0.492708 0.472222 0.0583333 0.0925926
        x,y,w,h:0.559375 0.614815 0.102083 0.194444

        frame:3 class num:0 has rect:2
        x,y,w,h:0 0.32963 0.00208333 0.346296
        x,y,w,h:0.560417 0.635185 0.105208 0.198148

        frame:4 class num:0 has rect:2
        x,y,w,h:0 0.32037 0.00208333 0.348148
        x,y,w,h:0.561458 0.648148 0.113542 0.214815

        frame:5 class num:0 has rect:3
        x,y,w,h:0 0.324074 0.00208333 0.35
        x,y,w,h:0.494792 0.487037 0.059375 0.087037
        x,y,w,h:0.563542 0.661111 0.120833 0.225926

2. The `02_video_dec_cuda` sample can verify the result and scale the rectangle
   parameters with the following commands:

        $ cp result.txt $HOME/tegra_multimedia_api/samples/02_video_dec_cuda
        $ cd $HOME/tegra_multimedia_api/samples/02_video_dec_cuda
        $ ./video_dec_cuda ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 --bbox-file result.txt

3. The sample supports in-stream resolution changes.
4. The default deploy file is:

        ../../data/Model/GoogleNet_one_class/GoogleNet_modified_oneClass_halfHD.prototxt

   The default model file is:

        ../../data/Model/GoogleNet_one_class/GoogleNet_modified_oneClass_halfHD.caffemodel

5. The batch size can be changed in `GoogleNet_modified_oneClass_halfHD.prototxt`,
   but is limited by the available system memory.
6. The following command must be executed before running with a new batch size:

        $ rm trtModel.cache

<a name="flow">
- - - - - - - - - - - - - - -
## Flow

The data pipeline is as follows:

    Input video file -> Decoder -> VIC -> TensorRT Inference -> Plain text file with Bounding Box info

#### Operation Flow ####

The sample does the following:

1. Decodes the input video stream.
2. Performs one-channel video conversion with VIC, which does the following:
   - Converts the buffer layout from block linear to pitch linear.
   - Scales the image resolution to what TensorRT requires.
3. Uses TensorRT to perform object identification and adds a bounding box to each
   identified object in the original frame.
4. Converts YUV to RGB format and saves it in a file.

![](l4t_mm_video_dec_tensorrt.jpg)

The block diagram shows not only the pipeline, but also how memory is shared
among the different engines, which can serve as a reference for the other samples.

- - - - - - - - - - - - - - -
<a name="key">
## Key Structure and Classes ##

This sample uses the following key structures and classes.

The global structure `context_t` manages all the resources in the application.

|Element|Description|
|---|---|
|NvVideoDecoder|Contains all video decoding-related elements and functions.|
|NvVideoConverter|Contains elements and functions for video format conversion.|
|EGLDisplay|The EGL display to which buffers are mapped as an EGLImage for CUDA processing.|
|conv_output_plane_buf_queue|Output plane queue for video conversion.|
|TRT_Context|Provides interfaces to load the Caffe model and perform inference.|

<a name="key_thread">
## Key Thread ##

|Member|Description|
|---|---|
|decCaptureLoop|Gets buffers from the decoder capture plane, pushes them to the converter, and handles resolution changes.|
|Conv outputPlane dqThread|Returns buffers dequeued from the converter output plane to the decoder capture plane.|
|Conv capturePlane dqThread|Gets buffers from the converter capture plane and pushes them to the TensorRT buffer queue.|
|trtThread|Performs CUDA processing and runs inference.|

<a name="notes">
## Programming Notes ##


- The results are normalized rectangles with coordinates in the range [0,1].
- You can use the `02_video_dec_cuda` sample to display and verify the result and to scale the rectangle parameters with the following command:

      $ ./video_dec_cuda <in-file> <in-format> --bbox-file result.txt

- The sample supports in-stream resolution changes.
- The default deploy file is

      GoogleNet_modified_oneClass_halfHD.prototxt

  The default model file is

      GoogleNet_modified_oneClass_halfHD.caffemodel

  Both are in this directory:

      $SDKDIR/data/Model/GoogleNet_one_class

- The batch size can be changed in `GoogleNet_modified_oneClass_halfHD.prototxt`.
  The batch size is limited by the available memory.
  The largest batch size for Ivanet is less than 40.
- End-of-stream (EOS) processing:

  a. Completely read the file.

  b. Push a null `v4l2buf` to the decoder.

  c. The decoder waits for all output plane buffers to return.

  d. Set `get_eos`:

      decCap thread exit

  e. End the TensorRT thread.

  f. Send EOS to the converter:

      conv output plane dqThread callback return false
      conv output plane dqThread exit
      conv capture plane dqThread callback return false
      conv capture plane dqThread exit

  g. Delete the decoder:

      deinit output plane and capture plane buffers

  h. Delete the converter:

      unmap capture plane buffers

<a name="options">
## Command Line Options ##

    ./video_dec_trt <in-file> <in-format> [options]

|Option|Description|
|--|--|
|`--trt-deployfile`|Sets the deploy file name.|
|`--trt-modelfile`|Sets the model file name.|
|`--trt-float32 <int>`|Specifies the inference precision [0-2], where `<int>` is one of the following:<ul><li>0: use the default</li><li>1: float16</li><li>2: float32</li></ul>|
|`--trt-enable-perf`|Enables performance measurement.|

@}