Copyright (c) 2016-2017, NVIDIA CORPORATION. All rights reserved.

@page l4t_mm_vid_decode_trt 04_video_dec_trt

- [Overview](#overview)
- [Building and Running](#build_and_run)
- [Key Structure and Classes](#key)
- [Key Thread](#key_thread)
- [Programming Notes](#notes)
- [Command Line Options](#options)

- - - - - - - - - - - - - - -
<a name="overview">
## Overview ##

This sample demonstrates the simplest way to use NVIDIA<sup>&reg;</sup>
TensorRT<sup>&tm;</sup> to decode video and save the bounding box information to
the `result.txt` file. TensorRT was previously known as the GPU Inference Engine
(GIE).

This sample does not require a camera or display.
<a name="build_and_run">
- - - - - - - - - - - - - - -
## Building and Running ##

#### Prerequisites ####
* You have followed Steps 1-3 in @ref mmapi_build.
* You have installed the following:
    - TensorRT (previously known as the GPU Inference Engine (GIE))

#### To build ####

Enter:

    $ cd $HOME/tegra_multimedia_api/samples/04_video_dec_trt
    $ make

#### To run ####

Enter:

    $ ./video_dec_trt <in-file> <in-format> [options]

For example:

    $ ./video_dec_trt ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 \
        --trt-deployfile ../../data/Model/GoogleNet_one_class/GoogleNet_modified_oneClass_halfHD.prototxt \
        --trt-modelfile ../../data/Model/GoogleNet_one_class/GoogleNet_modified_oneClass_halfHD.caffemodel
#### Notes ####

1. The sample generates `result.txt` by default. The result saves the normalized
   rectangles within [0,1]; a sketch that scales them to pixel coordinates
   appears after this list. For example, the following shows the beginning of a
   result file:

        frame:0 class num:0 has rect:3
        x,y,w,h:0 0.333333 0.00208333 0.344444
        x,y,w,h:0.492708 0.474074 0.05625 0.0888889
        x,y,w,h:0.558333 0.609259 0.09375 0.181481

        frame:1 class num:0 has rect:3
        x,y,w,h:0 0.333333 0.003125 0.346296
        x,y,w,h:0.491667 0.474074 0.0572917 0.0907407
        x,y,w,h:0.558333 0.612963 0.0979167 0.185185

        frame:2 class num:0 has rect:3
        x,y,w,h:0 0.331481 0.003125 0.344444
        x,y,w,h:0.492708 0.472222 0.0583333 0.0925926
        x,y,w,h:0.559375 0.614815 0.102083 0.194444

        frame:3 class num:0 has rect:2
        x,y,w,h:0 0.32963 0.00208333 0.346296
        x,y,w,h:0.560417 0.635185 0.105208 0.198148

        frame:4 class num:0 has rect:2
        x,y,w,h:0 0.32037 0.00208333 0.348148
        x,y,w,h:0.561458 0.648148 0.113542 0.214815

        frame:5 class num:0 has rect:3
        x,y,w,h:0 0.324074 0.00208333 0.35
        x,y,w,h:0.494792 0.487037 0.059375 0.087037
        x,y,w,h:0.563542 0.661111 0.120833 0.225926
2. The `02_video_dec_cuda` sample can verify the result and scale the rectangle
   parameters with the following commands:

        $ cp result.txt $HOME/tegra_multimedia_api/samples/02_video_dec_cuda
        $ cd $HOME/tegra_multimedia_api/samples/02_video_dec_cuda
        $ ./video_dec_cuda ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 --bbox-file result.txt

3. The sample supports in-stream resolution changes.
4. The default deploy file is:

        ../../data/Model/GoogleNet_one_class/GoogleNet_modified_oneClass_halfHD.prototxt

   and the default model file is:

        ../../data/Model/GoogleNet_one_class/GoogleNet_modified_oneClass_halfHD.caffemodel

5. The batch size can be changed in `GoogleNet_modified_oneClass_halfHD.prototxt`,
   but is limited by the available system memory.
6. The following command must be executed before running with a new batch size:
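Separately from the notes above: the rectangles stored in `result.txt` are normalized to [0,1] (note 1), so a consumer must scale them by the frame size before use. The following is a minimal, self-contained sketch of that conversion; the helper function and the 1920x1080 frame size are illustrative only and are not part of the sample:

    // Minimal sketch (not part of the sample): scale one normalized
    // "x,y,w,h:..." entry from result.txt to pixel coordinates.
    #include <cstdio>

    struct Rect { float x, y, w, h; };   // normalized, each value in [0,1]

    // Hypothetical helper: multiply by the frame size used for display.
    static Rect toPixels(const Rect &r, int frameWidth, int frameHeight)
    {
        return { r.x * frameWidth, r.y * frameHeight,
                 r.w * frameWidth, r.h * frameHeight };
    }

    int main()
    {
        // Second rectangle of frame:0 in the sample output above.
        Rect norm = { 0.492708f, 0.474074f, 0.05625f, 0.0888889f };
        Rect px = toPixels(norm, 1920, 1080);   // 1080p input stream
        std::printf("x=%.0f y=%.0f w=%.0f h=%.0f\n", px.x, px.y, px.w, px.h);
        return 0;
    }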
- - - - - - - - - - - - - - -

The data pipeline is as follows:

    Input video file -> Decoder -> VIC -> TensorRT Inference -> Plain text file with bounding box info
#### Operation Flow ####

The sample does the following:

1. Reads the encoded input video stream.
2. Performs one-channel video decode; VIC then does the following:
   - Converts the buffer layout from block linear to pitch linear.
   - Scales the image resolution to the size that TensorRT requires.
3. Uses TensorRT to perform object identification and adds a bounding box to each
   identified object in the original frame.
4. Converts YUV to RGB format and saves it in a file.

The block diagram shows not only the pipeline, but also how memory is shared among
the different engines, which can serve as a reference for the other samples.
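In the Decoder -> VIC -> TensorRT pipeline above, the VIC step (operation 2) is driven through the `NvVideoConverter` class described in the next section. The following is a rough sketch, assuming the `NvVideoConverter` plane-format interfaces used across the multimedia API samples; the resolutions are placeholders, not the sample's exact settings:

    // Sketch only: typical VIC configuration via NvVideoConverter.
    // Widths/heights are illustrative placeholders.
    #include "NvVideoConverter.h"

    static NvVideoConverter *setup_converter(uint32_t dec_w, uint32_t dec_h,
                                             uint32_t net_w, uint32_t net_h)
    {
        NvVideoConverter *conv = NvVideoConverter::createVideoConverter("conv0");
        if (!conv)
            return NULL;

        // Output plane receives the decoder's block-linear YUV frames.
        if (conv->setOutputPlaneFormat(V4L2_PIX_FMT_YUV420M, dec_w, dec_h,
                                       V4L2_NV_BUFFER_LAYOUT_BLOCKLINEAR) < 0)
        {
            delete conv;
            return NULL;
        }

        // Capture plane produces pitch-linear frames scaled to the network input size.
        if (conv->setCapturePlaneFormat(V4L2_PIX_FMT_YUV420M, net_w, net_h,
                                        V4L2_NV_BUFFER_LAYOUT_PITCH) < 0)
        {
            delete conv;
            return NULL;
        }

        return conv;
    }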
- - - - - - - - - - - - - - -
<a name="key">
## Key Structure and Classes ##

This sample uses the following key structures and classes:

The global structure `context_t` manages all the resources in the application.

|Element|Description|
|---|---|
|NvVideoDecoder|Contains all video decoding-related elements and functions.|
|NvVideoConverter|Contains elements and functions for video format conversion.|
|EGLDisplay|The EGLDisplay used for EGLImage/CUDA processing.|
|conv_output_plane_buf_queue|Output plane queue for video conversion.|
|TRT_Context|Provides a series of interfaces to load the Caffe model and perform inference.|
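For orientation, here is a simplified sketch of how `context_t` could aggregate the elements listed in the table above; the member names and types are illustrative approximations, not the sample's exact definition:

    // Simplified sketch of the global context; member names are illustrative
    // and may differ from the sample's actual definition.
    #include <queue>
    #include <EGL/egl.h>
    #include "NvBuffer.h"
    #include "NvVideoDecoder.h"
    #include "NvVideoConverter.h"

    class TRT_Context;   // provided by the sample's TensorRT helper code

    typedef struct
    {
        NvVideoDecoder   *dec;           // video decoding elements and functions
        NvVideoConverter *conv;          // VIC format conversion
        EGLDisplay        egl_display;   // display for EGLImage/CUDA processing
        std::queue<NvBuffer *> *conv_output_plane_buf_queue;  // converter output plane queue
        TRT_Context      *trt_ctx;       // loads the Caffe model and runs inference
    } context_t;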
<a name="key_thread">
- - - - - - - - - - - - - - -
## Key Thread ##

This sample uses the following threads:

|Thread|Description|
|---|---|
|decCaptureLoop|Gets buffers from the decoder capture plane, pushes them to the converter, and handles resolution changes.|
|Conv outputPlane dqThread|Returns the buffers dequeued from the converter output plane to the decoder capture plane.|
|Conv capturePlane dqThread|Gets buffers from the converter capture plane and pushes them to the TensorRT buffer queue.|
|trtThread|Performs CUDA processing and TensorRT inference.|
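The two converter dqThreads in the table are driven by callbacks registered on the converter planes; a callback that returns `false` makes its dqThread exit, which is how EOS propagates (see the Programming Notes below). The following is a rough sketch in the style used by the multimedia API samples; the function and argument names are illustrative:

    // Sketch of a capture-plane dqThread callback; names are illustrative.
    #include "NvVideoConverter.h"

    // Returning false from the callback stops the dqThread (used at EOS).
    static bool conv_capture_dqbuf_callback(struct v4l2_buffer *v4l2_buf,
                                            NvBuffer *buffer,
                                            NvBuffer *shared_buffer,
                                            void *arg)
    {
        if (!v4l2_buf)
            return false;            // error or EOS: exit the dqThread

        // ... push 'buffer' to the TensorRT buffer queue here ...

        return true;                 // keep the dqThread running
    }

    static void start_conv_capture_thread(NvVideoConverter *conv, void *app_ctx)
    {
        conv->capture_plane.setDQThreadCallback(conv_capture_dqbuf_callback);
        conv->capture_plane.startDQThread(app_ctx);
    }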
<a name="notes">
- - - - - - - - - - - - - - -
## Programming Notes ##

- The result saves the normalized rectangles within [0,1].
- You can use the `02_video_dec_cuda` sample to display and verify the result and scale
  the rectangle parameters with the following command:

        $ ./video_dec_cuda <in-file> <in-format> --bbox-file result.txt

- The sample supports in-stream resolution changes.
- The default deploy file is

        GoogleNet_modified_oneClass_halfHD.prototxt

  The default model file is

        GoogleNet_modified_oneClass_halfHD.caffemodel

  Both files are located in:

        $SDKDIR/data/Model/GoogleNet_one_class

- The batch size can be changed in `GoogleNet_modified_oneClass_halfHD.prototxt`.
  The batch size is limited by the available memory.
  The biggest batch size for Ivanet is less than 40.
- End-of-stream (EOS) process (a sketch of step b follows this list):

  a. Completely read the file.

  b. Push a null `v4l2buf` to the decoder.

  c. The decoder waits for all output plane buffers to be returned.

  e. End the TensorRT thread.

  f. Send EOS to the converter:

     - conv output plane dqThread callback returns false
     - conv output plane dqThread exits
     - conv capture plane dqThread callback returns false
     - conv capture plane dqThread exits

  g. Delete the decoder:

     - deinit output plane and capture plane buffers

  h. Delete the converter:

     - unmap capture plane buffers
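Step b above (pushing a null `v4l2buf`) is conventionally done by queueing an output-plane buffer whose first plane has zero bytes used, which the decoder treats as end of stream. A minimal sketch, assuming the `NvVideoDecoder`/`NvV4l2ElementPlane` interfaces used by the other samples; error handling is omitted:

    // Sketch only: signal EOS to the decoder by queueing an output-plane
    // buffer with 0 bytes used. 'index' must refer to a buffer previously
    // dequeued from the output plane.
    #include <cstdint>
    #include <cstring>
    #include "NvVideoDecoder.h"

    static int send_decoder_eos(NvVideoDecoder *dec, uint32_t index)
    {
        struct v4l2_buffer v4l2_buf;
        struct v4l2_plane planes[MAX_PLANES];

        memset(&v4l2_buf, 0, sizeof(v4l2_buf));
        memset(planes, 0, sizeof(planes));

        v4l2_buf.index = index;
        v4l2_buf.m.planes = planes;
        v4l2_buf.m.planes[0].bytesused = 0;   // zero bytes signals end of stream

        // After this, the decoder flushes and returns all queued buffers (step c).
        return dec->output_plane.qBuffer(v4l2_buf, NULL);
    }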
<a name="options">
- - - - - - - - - - - - - - -
## Command Line Options ##

    ./video_dec_trt <in-file> <in-format> [options]

|Option|Description|
|---|---|
|`--`trt-deployfile|Sets the deploy file name.|
|`--`trt-modelfile|Sets the model file name.|
|`--`trt-float32 `<int>`|Specifies the floating-point precision [0-2], where `<int>` is one of the following:<ul><li>0: use the default</li><li>1: float16</li><li>2: float32</li></ul>|
|`--`trt-enable-perf|Enables performance measurement.|