NVTX Trace


The NVIDIA Tools Extension Library (NVTX) lets you manually instrument your application. NVIDIA System Profiler then collects these annotations and presents them on the timeline.

NVIDIA System Profiler supports version 2.0 of the NVTX specification.

The following features are supported:

To learn more about specific features of NVTX, please refer to the NVTX header file: nvToolsExt.h.

To use NVTX in your application, follow these steps:

  1. Add #include "nvToolsExt.h" to your source code. This header file is located in the Target-arm/nvtx/include directory on the host.

  2. Link with the libnvToolsExt.a static library (-lnvToolsExt linker flag).

    • For Android, the library is located in the Target-arm/armv7 and Target-arm/armv8 directories on the host.

    • For Linux, the library is located in the Target-arm-linux/armv7 and Target-arm-linux/armv8 directories on the host.

  3. Also add the following compiler and linker flags: -pthread -ldl -lrt.

  4. Add calls to the NVTX API functions. For example, try adding nvtxRangePushA("main") at the beginning of the main() function, and nvtxRangePop() just before the return statement at the end.

    • For convenience in C++ code, consider adding a wrapper that implements the RAII (resource acquisition is initialization) pattern, which guarantees that every range gets closed.
  5. In the project settings, select the Collect NVTX trace checkbox.

  6. On Android, make sure that your application is launched by NVIDIA System Profiler. This is required so that the necessary launch environment is prepared, and the library responsible for collection of NVTX trace data is properly injected into the process.

  7. On Linux, if the application is launched by NVIDIA System Profiler, all required environment variables will be set up automatically.

  8. On Linux, if you launch the application manually, set the following environment variables:

    • For ARMv7 processes:

      NVTX_INJECTION32_PATH=/opt/nvidia/tegra_system_profiler/libToolsInjection32.so
      
    • For ARMv8 processes:

      NVTX_INJECTION64_PATH=/opt/nvidia/tegra_system_profiler/libToolsInjection64.so
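
The push/pop calls from step 4 and the suggested RAII wrapper can be sketched roughly as follows. This is a minimal sketch, not code from the NVTX distribution: ScopedRange, load_frame, and run_app are hypothetical names, and the stub functions in the #else branch are stand-ins that exist only so the example builds on a machine without nvToolsExt.h. It assumes a C++17 compiler (for __has_include).

```cpp
#include <cassert>

#if __has_include("nvToolsExt.h")
#include "nvToolsExt.h"  // real NVTX API shipped with the profiler
#else
// Stand-in stubs, NOT part of NVTX: they only track nesting depth so the
// sketch builds without the NVTX header and library.
static int g_stub_depth = 0;
static inline int nvtxRangePushA(const char*) { return g_stub_depth++; }
static inline int nvtxRangePop() {
    assert(g_stub_depth > 0);  // a pop without a matching push is a bug
    return --g_stub_depth;
}
#endif

// Hypothetical RAII helper: opens a range on construction and closes it on
// destruction, so every exit path (return, exception) pops the range.
class ScopedRange {
public:
    explicit ScopedRange(const char* name) { nvtxRangePushA(name); }
    ~ScopedRange() { nvtxRangePop(); }
    ScopedRange(const ScopedRange&) = delete;  // a copy would pop twice
    ScopedRange& operator=(const ScopedRange&) = delete;
};

// Hypothetical workload with an early return; the range is still closed.
int load_frame(int index) {
    ScopedRange range("load_frame");
    if (index < 0) {
        return -1;  // the destructor runs here and pops the range
    }
    return index * 2;
}

// Intended to be called from main(); shows the manual push/pop pattern
// from step 4 wrapped around two nested RAII ranges.
int run_app() {
    nvtxRangePushA("main");
    int result = load_frame(3) + load_frame(-1);  // 6 + (-1)
    nvtxRangePop();
    return result;
}
```

Calling run_app() from main() should produce a "main" range with two nested "load_frame" ranges on the timeline, assuming NVTX trace collection is enabled as described in the steps above.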
      

Typically, calls to NVTX functions can be left in the source code even when the application is not built for profiling, since the overhead is very low when the profiler is not attached.

NVTX is not intended for annotating very small pieces of code that are called very frequently. As a rule of thumb, if the annotated code usually takes less than 1 microsecond to execute, think carefully before adding an NVTX range around it, because the annotation overhead can become comparable to the work being measured.
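
One way to follow this guideline is to annotate a whole batch of work instead of each tiny step inside it. The sketch below is illustrative only: scale and process_batch are hypothetical names, and the stubs in the #else branch are stand-ins (not NVTX code) that count calls so the example builds without nvToolsExt.h. It assumes a C++17 compiler.

```cpp
#include <cassert>
#include <vector>

#if __has_include("nvToolsExt.h")
#include "nvToolsExt.h"  // real NVTX API shipped with the profiler
#else
// Stand-in stubs, NOT part of NVTX: they count pushes and track depth.
static int g_stub_pushes = 0;
static int g_stub_depth = 0;
static inline int nvtxRangePushA(const char*) {
    ++g_stub_pushes;
    return g_stub_depth++;
}
static inline int nvtxRangePop() {
    assert(g_stub_depth > 0);
    return --g_stub_depth;
}
#endif

// Hypothetical per-element work, far too short to deserve its own range.
static inline float scale(float x) { return x * 0.5f; }

// One range covers the whole batch; the hot loop contains no NVTX calls.
float process_batch(const std::vector<float>& samples) {
    nvtxRangePushA("process_batch");
    float sum = 0.0f;
    for (float s : samples) {
        sum += scale(s);
    }
    nvtxRangePop();
    return sum;
}
```

With this structure the number of NVTX calls stays constant per batch, no matter how many elements the batch contains.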

Note: Range annotations should be matched carefully. If many ranges are opened but never closed, NVIDIA System Profiler has no meaningful way to visualize them. As a rule of thumb, keep no more than a couple dozen ranges open at any point in time. NVIDIA System Profiler does not support reports with many unclosed ranges.


 

NVIDIA® System Profiler Documentation Rev. 3.9.170817 ©2017. NVIDIA Corporation. All Rights Reserved.