L4T Multimedia API Reference

27.1 Release

TensorRT Sample with Arbitrary Batch Size

Overview

This sample demonstrates the simplest way to use NVIDIA® TensorRT to run inference on a decoded video stream and save the resulting bounding box information to the result.txt file. TensorRT was previously known as the GPU Inference Engine (GIE).

The data pipeline is as follows:

Input video file -> Decoder -> VIC -> TensorRT Inference -> Plain text file with Bounding Box info

Operation Flow

The sample does the following:

  1. Reads the encoded input video stream.
  2. Performs one-channel video decode.
  3. Uses VIC, which does the following:
    • Converts the buffer layout from block linear to pitch linear.
    • Scales the image resolution to the resolution required by TensorRT.
  4. Uses TensorRT to perform object identification and add a bounding box to each identified object in the original frame.
  5. Converts the frame from YUV to RGB format and saves it to a file.

The block diagram shows not only the pipeline but also how memory is shared among the different engines, which can serve as a reference for the other samples.
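For orientation, the following is a minimal sketch of how the decoder and converter stages of such a pipeline are typically created with the Multimedia API. The element names, buffer counts, and chunk size are illustrative assumptions, and TensorRT setup and most error handling are omitted.

    #include "NvVideoDecoder.h"
    #include "NvVideoConverter.h"

    // Illustrative sketch only: create the decoder and converter elements
    // that feed TensorRT. Error handling and TensorRT setup are omitted.
    static int
    setup_pipeline(NvVideoDecoder **dec_out, NvVideoConverter **conv_out)
    {
        // One-channel decoder; its output plane receives the encoded stream.
        NvVideoDecoder *dec = NvVideoDecoder::createVideoDecoder("dec0");
        if (!dec)
            return -1;

        // Subscribe to resolution-change events so the capture loop can
        // reconfigure the capture plane when the stream resolution changes.
        dec->subscribeEvent(V4L2_EVENT_RESOLUTION_CHANGE, 0, 0);

        // Feed the H.264 bitstream in chunks (chunk size is illustrative).
        dec->setOutputPlaneFormat(V4L2_PIX_FMT_H264, 2 * 1024 * 1024);
        dec->output_plane.setupPlane(V4L2_MEMORY_MMAP, 10, true, false);
        dec->output_plane.setStreamStatus(true);

        // VIC (exposed as NvVideoConverter) converts the block linear decoder
        // output to pitch linear and scales it to the network input size.
        NvVideoConverter *conv = NvVideoConverter::createVideoConverter("conv0");
        if (!conv)
        {
            delete dec;
            return -1;
        }

        *dec_out = dec;
        *conv_out = conv;
        return 0;
    }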

Additional Details

  • The README file provides more details about how to build, run, and implement this sample.

Prerequisites

Before running the sample, you must have the following:

  • CUDA
  • TensorRT (previously known as GPU Inference Engine (GIE))
  • OpenCV4Tegra

This sample does not require a camera or display.


Key Structure and Classes

This sample uses the following key structures and classes:

The global structure context_t manages all the resources in the application.

Element                        Description
NvVideoDecoder                 Contains all video decoding-related elements and functions.
NvVideoConverter               Contains elements and functions for video format conversion.
EGLDisplay                     The EGLImage used for CUDA processing.
conv_output_plane_buf_queue    Output plane queue for video conversion.
GIE_Context                    Provides a series of interfaces to load the Caffe model and run inference.
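
As an illustration only, a context structure of this kind might be organized roughly as follows; the member names are assumptions made for this sketch, not the sample's actual definitions.

    #include <queue>
    #include <pthread.h>
    #include <EGL/egl.h>
    #include "NvVideoDecoder.h"
    #include "NvVideoConverter.h"

    // Illustrative sketch of a context structure that owns all pipeline
    // resources. Member names are hypothetical.
    typedef struct
    {
        NvVideoDecoder   *dec;           // video decoder element
        NvVideoConverter *conv;          // VIC-backed format converter
        EGLDisplay        egl_display;   // EGL display used for the CUDA EGLImage

        // Converter output plane buffers waiting to be returned to the
        // decoder capture plane.
        std::queue<NvBuffer *> conv_output_plane_buf_queue;

        // Converted frames waiting for TensorRT inference (hypothetical name).
        std::queue<NvBuffer *> trt_buf_queue;
        pthread_mutex_t        queue_lock;
        pthread_cond_t         queue_cond;

        pthread_t dec_capture_loop;      // decCaptureLoop thread
        pthread_t gie_thread;            // TensorRT inference thread
        bool      got_eos;               // set once end of stream is reached
    } context_t;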

Key Threads

Member                        Description
decCaptureLoop                Gets buffers from the decoder capture plane, pushes them to the converter, and handles resolution changes.
Conv outputPlane dqThread     Returns buffers dequeued from the converter output plane to the decoder capture plane.
Conv capturePlane dqThread    Gets buffers from the converter capture plane and pushes them to the TensorRT buffer queue.
gieThread                     Performs CUDA processing and runs inference.
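
For reference, a converter capture-plane dqThread callback of this kind might look like the sketch below, reusing the hypothetical context structure from the previous section; returning false is what stops the dqThread at EOS.

    // Illustrative sketch of a capture plane dqThread callback. The
    // Multimedia API calls it for each buffer dequeued from the converter
    // capture plane.
    static bool
    conv_capture_dqbuf_thread_callback(struct v4l2_buffer *v4l2_buf,
                                       NvBuffer *buffer,
                                       NvBuffer *shared_buffer,
                                       void *arg)
    {
        context_t *ctx = (context_t *) arg;

        // A missing buffer or a zero-byte plane signals EOS; returning false
        // terminates this dqThread.
        if (!v4l2_buf || buffer->planes[0].bytesused == 0)
            return false;

        // Hand the converted, pitch-linear frame to the TensorRT thread
        // (queue name is hypothetical).
        pthread_mutex_lock(&ctx->queue_lock);
        ctx->trt_buf_queue.push(buffer);
        pthread_cond_signal(&ctx->queue_cond);
        pthread_mutex_unlock(&ctx->queue_lock);

        return true;
    }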

Programming Notes

  • The result file stores each rectangle in normalized coordinates within [0,1] (see the coordinate-scaling sketch at the end of these notes).
  • You can use the 02_video_dec_cuda sample to display and verify the result and to scale the rectangle parameters with the following command:
     $ ./video_dec_cuda <in-file> <in-format> --bbox-file result.txt
    
  • Supports in-stream resolution changes.
  • The default deploy file is

     GoogleNet-modified.prototxt
    

    The default model file is

    GoogleNet-modified-online_iter_30000.caffemodel
    

    In this directory:

    $SDKDIR/data/model
    
  • The batch size can be changed in GoogleNet-modified.prototxt. The batch size is limited by available memory; the largest batch size for Ivanet is less than 40.
  • End-of-stream (EOS) processing proceeds as follows (see the sketch after these steps):

    a. Read the input file completely.

    b. Push an empty (zero-byte) v4l2 buffer to the decoder.

    c. The decoder waits for all output plane buffers to return.

    d. Set get_eos; the decCaptureLoop thread exits.

    e. End the TensorRT thread.

    f. Send EOS to the converter: the conv output plane dqThread callback returns false and that thread exits, then the conv capture plane dqThread callback returns false and that thread exits.

    g. Delete the decoder, which deinitializes the output plane and capture plane buffers.

    h. Delete the converter, which unmaps the capture plane buffers.
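    As a minimal sketch of step (b), assuming the standard Multimedia API decoder interface, the EOS buffer can be queued on the decoder output plane roughly as follows (variable and function names are illustrative):

    #include <string.h>
    #include "NvVideoDecoder.h"

    // Illustrative sketch only: after the input file has been read
    // completely, queue a buffer with zero bytes used on the decoder
    // output plane to signal end of stream.
    static int
    send_decoder_eos(NvVideoDecoder *dec, NvBuffer *buffer)
    {
        struct v4l2_buffer v4l2_buf;
        struct v4l2_plane planes[MAX_PLANES];

        memset(&v4l2_buf, 0, sizeof(v4l2_buf));
        memset(planes, 0, sizeof(planes));

        v4l2_buf.index = buffer->index;
        v4l2_buf.m.planes = planes;

        // Zero bytes used marks the EOS buffer; the decoder then waits for
        // all output plane buffers to be returned before shutdown continues.
        buffer->planes[0].bytesused = 0;
        v4l2_buf.m.planes[0].bytesused = 0;

        return dec->output_plane.qBuffer(v4l2_buf, NULL);
    }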

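As a small illustration of the note on normalized coordinates above, a helper that maps a [0,1] rectangle back to pixel coordinates could look like the following (the structure and function names are assumptions, not part of the sample):

    // Illustrative helper: scale a bounding box stored in normalized [0,1]
    // coordinates back to pixel coordinates of the displayed frame.
    struct NormalizedBBox
    {
        float x, y, w, h;   // hypothetical normalized rectangle from result.txt
    };

    static void
    scale_bbox_to_frame(const NormalizedBBox &box, unsigned int frame_width,
                        unsigned int frame_height, unsigned int *left,
                        unsigned int *top, unsigned int *width,
                        unsigned int *height)
    {
        *left   = (unsigned int) (box.x * frame_width);
        *top    = (unsigned int) (box.y * frame_height);
        *width  = (unsigned int) (box.w * frame_width);
        *height = (unsigned int) (box.h * frame_height);
    }
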
Command Line Options

./video_dec_gie <in-file> <in-format> [options]

Use the -h option to view the currently supported options.