Copyright (c) 2016-2017, NVIDIA CORPORATION. All rights reserved.

@page l4t_mm_vid_decode_trt 04_video_dec_trt

- [Overview](#overview)
- [Building and Running](#build_and_run)
- [Key Structure and Classes](#key)
- [Key Thread](#key_thread)
- [Programming Notes](#notes)
- [Command Line Options](#options)
- - - - - - - - - - - - - - -
## Overview ##

This sample demonstrates the simplest way to use NVIDIA<sup>®</sup>
TensorRT<sup>&trade;</sup> to run inference on decoded video and save the
bounding box information to the `result.txt` file. TensorRT was previously
known as GPU Inference Engine (GIE).

This sample does not require a camera or display.
<a name="build_and_run">
- - - - - - - - - - - - - - -
## Building and Running ##
#### Prerequisites ####
* You have followed Steps 1-3 in @ref mmapi_build.
* You have installed the following:
  - NVIDIA<sup>®</sup> CUDA<sup>®</sup>
  - TensorRT (previously known as GPU Inference Engine (GIE))

      $ cd $HOME/multimedia_api/samples/04_video_dec_trt

      $ ./video_dec_trt [Channel-num] <in-file1> <in-file2> ... <in-format> [options]
The following example generates two results: `result0.txt` and `result1.txt`.
The results contain normalized rectangle coordinates for detected objects.

      $ ./video_dec_trt 2 ../../data/Video/sample_outdoor_car_1080p_10fps.h264 \
        ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 \
        --trt-deployfile ../../data/Model/resnet10/resnet10.prototxt \
        --trt-modelfile ../../data/Model/resnet10/resnet10.caffemodel \
- Boost the clocks before measuring performance.

      $ sudo ~/jetson_clocks.sh

- To change the batch size, use the `Channel-num` option.
- For information on opening more than 16 video devices, see the following
  NVIDIA<sup>®</sup> DevTalk topic:

  <a href="https://devtalk.nvidia.com/default/topic/1025375/" target="_blank">https://devtalk.nvidia.com/default/topic/1025375/</a>

- If the mode or any other parameter is changed, run the following command.
- The log shows the performance results with the following syntax:

      Inference Performance(ms per batch):xx Wait from decode takes(ms per batch):xx

- To verify the result and scale the rectangle parameters, enter the following
  commands:

      $ cp result*.txt $HOME/multimedia_api/samples/02_video_dec_cuda
      $ cd $HOME/multimedia_api/samples/02_video_dec_cuda
      $ ./video_dec_cuda ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 --bbox-file result0.txt
      $ ./video_dec_cuda ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 --bbox-file result1.txt
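
The performance log line described above reports milliseconds per batch, which is easier to compare across runs once it is converted to frames per second. The following is a minimal sketch, not part of the sample. It assumes that `xx` is a plain decimal number and that each batch holds one frame per channel, so the batch size equals `Channel-num`. The names `parse_perf_line` and `inference_fps` are hypothetical.

```python
import re

# Matches the log syntax documented above. Assumption: "xx" is a decimal number.
PERF_PATTERN = re.compile(
    r"Inference Performance\(ms per batch\):\s*(?P<infer>\d+(?:\.\d+)?)\s+"
    r"Wait from decode takes\(ms per batch\):\s*(?P<wait>\d+(?:\.\d+)?)"
)


def parse_perf_line(line):
    """Return (inference_ms, wait_ms) for a performance log line, else None."""
    match = PERF_PATTERN.search(line)
    if match is None:
        return None
    return float(match["infer"]), float(match["wait"])


def inference_fps(infer_ms, batch_size):
    """Frames per second handled by inference alone.

    Assumption: one batch contains `batch_size` frames (one per channel).
    """
    if infer_ms <= 0:
        raise ValueError("inference time must be positive")
    return batch_size * 1000.0 / infer_ms


# Hypothetical log line with made-up numbers, for illustration only.
sample = "Inference Performance(ms per batch):40 Wait from decode takes(ms per batch):10"
infer_ms, wait_ms = parse_perf_line(sample)
print(inference_fps(infer_ms, batch_size=2))
```

The figure covers inference time only. Whether the decode wait overlaps with inference depends on the sample's threading, so the two numbers are kept separate here.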
- - - - - - - - - - - - - - -

The data pipeline is as follows:

      Input video file -> Decoder -> VIC -> TensorRT Inference -> Plain text file with Bounding Box info

#### Operation Flow ####

The sample does the following:

1. Reads the encoded input video stream.
2. Performs one-channel video decode and VIC processing, where VIC does the following:
   - Converts the buffer layout from block linear to pitch linear.
   - Scales the image resolution to the resolution that TensorRT requires.
3. Uses TensorRT to perform object identification and adds a bounding box to
   the object identified in the original frame.
4. Converts the image from YUV to RGB format and saves it in a file.

The following block diagram shows the video decoder pipeline and memory sharing
between different engines. This memory sharing also applies to other
L4T Multimedia samples.
- - - - - - - - - - - - - - -

## Key Structure and Classes ##

This sample uses the following key structures and classes:

The global structure `context_t` manages all the resources in the application.

|Element|Description|
|---|---|
|NvVideoDecoder|Contains all video decoding-related elements and functions.|
|NvVideoConverter|Contains elements and functions for video format conversion.|
|EGLDisplay|Specifies the EGLImage used for CUDA processing.|
|conv_output_plane_buf_queue|Specifies the output plane queue for video conversion.|
|TRT_Context|Specifies interfaces for loading Caffemodel and performing inference.|

<a name="key_thread">
## Key Thread ##

|Thread|Description|
|---|---|
|decCaptureLoop|Gets buffers from the decoder capture plane, pushes them to the converter, and handles resolution changes.|
|Conv outputPlane dqThread|Returns the buffers dequeued from the converter output plane to the decoder capture plane.|
|Conv capturePlane dqThread|Gets buffers from the converter capture plane and pushes them to the TensorRT buffer queue.|
|trtThread|Performs CUDA processing and runs inference.|
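
The handoff between the converter capture-plane thread and the inference thread is a producer/consumer queue. The following is a minimal sketch of that pattern in Python, not the sample's C++ implementation. The function names, the `EOS` sentinel, the queue depth, and the way frames are grouped into batches are all assumptions made for illustration.

```python
import queue
import threading

EOS = object()  # hypothetical sentinel that marks the end of the stream


def converter_capture_thread(frames, trt_queue):
    """Stand-in for the converter capture-plane thread: forwards each buffer."""
    for frame in frames:
        trt_queue.put(frame)  # blocks while the queue is full (back-pressure)
    trt_queue.put(EOS)


def trt_thread(trt_queue, batch_size, results):
    """Stand-in for the inference thread: groups buffers into batches."""
    batch = []
    while True:
        item = trt_queue.get()
        if item is EOS:
            break
        batch.append(item)
        if len(batch) == batch_size:
            results.append(tuple(batch))  # real code would run inference here
            batch = []
    if batch:
        results.append(tuple(batch))  # flush a final partial batch


def run_pipeline(frames, batch_size):
    """Run both stand-in threads to completion and return the batches seen."""
    trt_queue = queue.Queue(maxsize=4)  # assumed depth, chosen arbitrarily
    results = []
    producer = threading.Thread(
        target=converter_capture_thread, args=(frames, trt_queue))
    consumer = threading.Thread(
        target=trt_thread, args=(trt_queue, batch_size, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results


print(run_pipeline(range(5), batch_size=2))
```

A bounded queue keeps the producer from running far ahead of inference, which matters when the buffers are a fixed pool shared between engines.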
## Programming Notes ##

To display and verify the results and to scale the rectangle parameters, use the
`02_video_dec_cuda` sample as follows:

      $ ./video_dec_cuda <in-file> <in-format> --bbox-file result.txt
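
Scaling a normalized rectangle back to pixel units is a multiplication by the frame size. The following is a minimal sketch of that arithmetic, not the code used by `02_video_dec_cuda`. It assumes the rectangle is given as left, top, width, and height, each in [0, 1]. The layout of `result.txt` is not described here, so the field order and the name `scale_rect` are hypothetical.

```python
from typing import NamedTuple


class PixelRect(NamedTuple):
    left: int
    top: int
    width: int
    height: int


def scale_rect(x, y, w, h, frame_width, frame_height):
    """Scale a normalized rectangle (all values in [0, 1]) to pixel units."""
    for value in (x, y, w, h):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"normalized value out of range: {value}")
    left = round(x * frame_width)
    top = round(y * frame_height)
    # Clamp so the rectangle never extends past the frame edge.
    width = min(round(w * frame_width), frame_width - left)
    height = min(round(h * frame_height), frame_height - top)
    return PixelRect(left, top, width, height)


# Hypothetical rectangle on a 1920x1080 frame.
print(scale_rect(0.25, 0.5, 0.1, 0.2, 1920, 1080))
```

Because the coordinates are normalized, the same rectangle can be drawn on the original frame or on the downscaled network input without rerunning inference.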

The sample does the following:
- Saves the resulting rectangle coordinates normalized to the range [0,1].
- Supports in-stream resolution changes.
- Uses the default file:

  The default model file is

      $SDKDIR/data/Model/resnet10

- Performs end-of-stream (EOS) processing as follows:

  a. Completely reads the file.

  b. Pushes a null `v4l2buf` to the decoder.

  c. Waits for all output plane buffers to return.

  e. Ends the TensorRT thread.

  f. Sends EOS to the converter:
     - The conv output plane dqThread callback returns false.
     - The conv output plane dqThread exits.
     - The conv capture plane dqThread callback returns false.
     - The conv capture plane dqThread exits.

  g. Deletes the decoder:
     - Deinitializes the output plane and capture plane buffers.

  h. Deletes the converter:
     - Unmaps the capture plane buffers.

## Command Line Options ##

      ./video_dec_trt [Channel-num] <in-file1> <in-file2> ... <in-format> [options]

|Option|Description|
|---|---|
|`--trt-deployfile`|Sets the deploy file name.|
|`--trt-modelfile`|Sets the model file name.|
|`--trt-mode <int>`|Specifies the inference precision, where `<int>` is one of the following:<ul><li>0 float16</li><li>1 float32</li><li>2 int8</li></ul>|
|`--trt-enable-perf`|Enables performance measurement.|
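
When the sample is launched from a script, the argument order shown above is easy to get wrong. The following is a minimal sketch of a helper that assembles the argument list from the documented options only. The helper name `build_command` is hypothetical, and setting `Channel-num` to the number of input files is an assumption based on the two-channel example earlier on this page.

```python
# Precision values documented for --trt-mode.
TRT_MODES = {0: "float16", 1: "float32", 2: "int8"}


def build_command(in_files, in_format, deploy_file=None, model_file=None,
                  mode=None, enable_perf=False):
    """Build the argument list for ./video_dec_trt without running it."""
    if not in_files:
        raise ValueError("at least one input file is required")
    # Assumption: Channel-num equals the number of input files.
    cmd = ["./video_dec_trt", str(len(in_files)), *in_files, in_format]
    if deploy_file is not None:
        cmd += ["--trt-deployfile", deploy_file]
    if model_file is not None:
        cmd += ["--trt-modelfile", model_file]
    if mode is not None:
        if mode not in TRT_MODES:
            raise ValueError(f"--trt-mode must be one of {sorted(TRT_MODES)}")
        cmd += ["--trt-mode", str(mode)]
    if enable_perf:
        cmd.append("--trt-enable-perf")
    return cmd


video = "../../data/Video/sample_outdoor_car_1080p_10fps.h264"
print(" ".join(build_command([video, video], "H264", mode=0, enable_perf=True)))
```

The list form can be passed directly to `subprocess.run`, which avoids shell quoting problems with file paths.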