CUDA Trace


NVIDIA System Profiler can capture information about CUDA execution in the profiled process.

The following CUDA driver and runtime (toolkit) versions are currently supported: 6.5, 7.0, and 8.0.

The following information can be collected and presented on the timeline in the report:

CUDA thread rows

Near the bottom of the timeline row tree, a GPU node appears and contains a CUDA node. Within the CUDA node, each CUDA context used by the process is shown along with its corresponding CUDA streams. Streams contain the memory operations and kernel launches executed on the GPU. Kernel launches are shown in blue, and memory transfers are shown in red.
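As a minimal sketch of a workload that would produce the kind of activity described above, the program below issues a host-to-device copy, a kernel launch, and a device-to-host copy on each of two streams. The kernel name `scale`, the `CHECK` macro, the buffer size, and the two-stream layout are illustrative assumptions and do not come from the profiler documentation; only standard CUDA runtime calls are used.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: multiplies each element by a factor.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper macro: report a failed CUDA runtime call and leave main.
#define CHECK(call)                                               \
    do {                                                          \
        cudaError_t err_ = (call);                                \
        if (err_ != cudaSuccess) {                                \
            fprintf(stderr, "%s failed: %s\n", #call,             \
                    cudaGetErrorString(err_));                    \
            return 1;                                             \
        }                                                         \
    } while (0)

int main() {
    const int kStreams = 2;          // assumed stream count for the example
    const int n = 1 << 20;           // assumed element count per stream
    const size_t bytes = n * sizeof(float);

    cudaStream_t streams[kStreams];
    float *host[kStreams];
    float *dev[kStreams];

    for (int s = 0; s < kStreams; ++s) {
        CHECK(cudaStreamCreate(&streams[s]));
        // Pinned host memory allows cudaMemcpyAsync to be truly asynchronous.
        CHECK(cudaMallocHost((void **)&host[s], bytes));
        CHECK(cudaMalloc((void **)&dev[s], bytes));
        for (int i = 0; i < n; ++i) host[s][i] = 1.0f;
    }

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    for (int s = 0; s < kStreams; ++s) {
        // Memory transfer: host to device on stream s.
        CHECK(cudaMemcpyAsync(dev[s], host[s], bytes,
                              cudaMemcpyHostToDevice, streams[s]));
        // Kernel launch on the same stream.
        scale<<<blocks, threads, 0, streams[s]>>>(dev[s], n, 2.0f);
        CHECK(cudaGetLastError());
        // Memory transfer: device to host on stream s.
        CHECK(cudaMemcpyAsync(host[s], dev[s], bytes,
                              cudaMemcpyDeviceToHost, streams[s]));
    }

    for (int s = 0; s < kStreams; ++s) {
        CHECK(cudaStreamSynchronize(streams[s]));
        CHECK(cudaFree(dev[s]));
        CHECK(cudaFreeHost(host[s]));
        CHECK(cudaStreamDestroy(streams[s]));
    }
    return 0;
}
```

Under these assumptions, a trace of this program would be expected to show one CUDA context with two streams, each holding two memory transfers and one kernel launch.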

CUDA GPU rows

The easiest way to capture CUDA information is to launch the process from NVIDIA System Profiler, which sets up the environment for you. To do so, set up a normal launch and select the Collect CUDA trace checkbox.

Configure CUDA trace

Additional configuration parameters are available:

If desired, the target application can be manually set up to collect CUDA trace. To capture information about CUDA execution, the following requirements should be satisfied:

If the application is started by NVIDIA System Profiler, all required environment variables will be set automatically.

Note that if your application crashes before all collected CUDA trace data has been copied out, some or all of that data may be lost and will not appear in the report.
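One common precaution in CUDA profiling workflows is to route every exit path through a step that waits for outstanding GPU work and then resets the device, rather than aborting on the first error. The sketch below illustrates that pattern. Whether this is sufficient for NVIDIA System Profiler to copy out all trace data is an assumption, since the documentation does not say, and the helper name `shutdown_cuda` is hypothetical.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: wait for outstanding GPU work, then tear down the
// CUDA context before the process exits. Assumed, not documented, to give
// a profiler the best chance of retrieving its collected data.
static int shutdown_cuda(int exit_code) {
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceSynchronize failed: %s\n",
                cudaGetErrorString(err));
        exit_code = EXIT_FAILURE;
    }
    cudaDeviceReset();
    return exit_code;
}

int main() {
    const size_t bytes = 1024 * sizeof(float);  // assumed buffer size
    float *dev = nullptr;

    cudaError_t err = cudaMalloc((void **)&dev, bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        // Exit through the common shutdown path instead of calling abort().
        return shutdown_cuda(EXIT_FAILURE);
    }

    err = cudaMemset(dev, 0, bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemset failed: %s\n", cudaGetErrorString(err));
    }

    cudaFree(dev);
    return shutdown_cuda(err == cudaSuccess ? EXIT_SUCCESS : EXIT_FAILURE);
}
```

This pattern only helps with errors the application can detect and handle; a hard crash such as a segmentation fault would still bypass it.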


 

NVIDIA® System Profiler Documentation Rev. 3.9.170817 ©2017. NVIDIA Corporation. All Rights Reserved.