NVIDIA CUDA C SDK - Image Processing

The GPU Computing SDK includes 100+ code samples, utilities, whitepapers, and additional documentation to help you get started developing, porting, and optimizing your applications for the CUDA architecture. This page gives quick access to many of the SDK resources; you can also consult the SDK documentation or download the complete SDK.

Please note that you may need to install the latest NVIDIA drivers and CUDA Toolkit to compile and run the code samples.

Refer to the SDK release notes for more information.


Volumetric Filtering with 3D Textures and Surface Writes
This sample demonstrates 3D Volumetric Filtering using 3D Textures and 3D Surface Writes.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


Optical Flow
Variational optical flow estimation example. Uses textures for image operations. Shows how a simple PDE solver can be accelerated with CUDA.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


CUDA Video Encode (C Library) API
This sample demonstrates how to effectively use the CUDA Video Encoder API to encode H.264 video. Video in YUV formats is taken as input (from either CPU system memory or GPU memory), and the output frames are encoded to an H.264 file.

Download - Windows (x86)
Download - Windows (x64)


Bilateral Filter
The bilateral filter is an edge-preserving, non-linear smoothing filter, implemented here in CUDA with OpenGL rendering. It can be used in image recovery and denoising. Each pixel is weighted by considering both the spatial distance and the color distance between it and its neighbors. Reference: C. Tomasi, R. Manduchi, "Bilateral Filtering for Gray and Color Images," Proceedings of the ICCV, 1998, http://users.soe.ucsc.edu/~manduchi/Papers/ICCV98.pdf

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac
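
The weighting described for the Bilateral Filter sample can be sketched on the CPU as follows. This is a minimal C++ reference under stated assumptions, not the sample's CUDA source: the function name bilateralFilter, the parameters sigmaSpatial and sigmaRange, the Gaussian form of both weights, and the single-channel float image layout are all illustrative choices.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// CPU reference sketch of a bilateral filter on a row-major grayscale image.
// All names and the Gaussian weight form are assumptions for illustration.
std::vector<float> bilateralFilter(const std::vector<float>& src,
                                   int width, int height, int radius,
                                   float sigmaSpatial, float sigmaRange)
{
    assert(src.size() == static_cast<std::size_t>(width) * height);
    std::vector<float> dst(src.size());
    const float spatialDenom = 2.0f * sigmaSpatial * sigmaSpatial;
    const float rangeDenom = 2.0f * sigmaRange * sigmaRange;

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const float center = src[y * width + x];
            float sum = 0.0f;
            float weightSum = 0.0f;
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    const int nx = x + dx;
                    const int ny = y + dy;
                    // Neighbors outside the image are skipped.
                    if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;
                    const float neighbor = src[ny * width + nx];
                    const float dist2 = static_cast<float>(dx * dx + dy * dy);
                    const float diff = neighbor - center;
                    // The weight falls off with spatial distance and with
                    // intensity difference, so values across an edge
                    // contribute very little.
                    const float w = std::exp(-dist2 / spatialDenom
                                             - diff * diff / rangeDenom);
                    sum += w * neighbor;
                    weightSum += w;
                }
            }
            // weightSum is at least 1 because the center pixel has weight 1.
            dst[y * width + x] = sum / weightSum;
        }
    }
    return dst;
}
```

A small sigmaRange keeps sharp edges nearly intact, while a large one makes the filter behave more like an ordinary Gaussian blur.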


Simple Surface Write
Simple example that demonstrates the use of 2D surface references (Write-to-Texture)

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


Function Pointers
This sample illustrates how to use function pointers and implements the Sobel Edge Detection filter for 8-bit monochrome images.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


Pitch Linear Texture
Demonstrates the use of pitch linear textures.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


Simple Texture
Simple example that demonstrates use of Textures in CUDA.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


Simple Texture (Driver Version)
Simple example that demonstrates use of Textures in CUDA. This sample uses the new CUDA 4.0 kernel launch Driver API.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


Simple Texture 3D
Simple example that demonstrates use of 3D Textures in CUDA.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


DCT8x8
This sample demonstrates how Discrete Cosine Transform (DCT) for blocks of 8 by 8 pixels can be performed using CUDA: a naive implementation by definition and a more traditional approach used in many libraries. As opposed to implementing DCT in a fragment shader, CUDA allows for an easier and more efficient implementation.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac
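
The "naive implementation by definition" mentioned for DCT8x8 can be illustrated with a direct evaluation of the 2D DCT-II formula on one 8 by 8 block. This is a CPU sketch in C++, not the sample's code; the function name dct8x8Naive, the row-major 64-element layout, and the orthonormal scaling are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Direct evaluation of the orthonormal 2D DCT-II for one 8x8 block stored
// row-major in a 64-element vector. Illustrative sketch; names are assumed.
std::vector<float> dct8x8Naive(const std::vector<float>& block)
{
    assert(block.size() == 64);
    const double pi = std::acos(-1.0);
    std::vector<float> coeff(64);

    for (int u = 0; u < 8; ++u) {          // vertical frequency
        for (int v = 0; v < 8; ++v) {      // horizontal frequency
            double sum = 0.0;
            for (int y = 0; y < 8; ++y) {
                for (int x = 0; x < 8; ++x) {
                    sum += block[y * 8 + x]
                         * std::cos((2 * y + 1) * u * pi / 16.0)
                         * std::cos((2 * x + 1) * v * pi / 16.0);
                }
            }
            // Normalization factors that make the transform orthonormal.
            const double cu = (u == 0) ? std::sqrt(0.5) : 1.0;
            const double cv = (v == 0) ? std::sqrt(0.5) : 1.0;
            coeff[u * 8 + v] = static_cast<float>(0.25 * cu * cv * sum);
        }
    }
    return coeff;
}
```

Each of the 64 coefficients sums over all 64 pixels, which is why library implementations usually factor the transform into separable 1D passes instead.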


1D Discrete Haar Wavelet Decomposition
Discrete Haar wavelet decomposition for 1D signals whose length is a power of 2.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac
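
For readers unfamiliar with the decomposition named above, here is a minimal CPU sketch in C++. It assumes the orthonormal convention (sums and differences scaled by 1/sqrt(2)) and an output layout of one overall approximation coefficient followed by detail coefficients from coarsest to finest; the sample may use a different scaling or layout, and the name haarDecompose is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Full 1D Haar decomposition of a signal whose length is a power of 2.
// Assumed convention: orthonormal scaling, output = [approximation,
// coarsest detail, ..., finest details].
std::vector<float> haarDecompose(std::vector<float> signal)
{
    const std::size_t n = signal.size();
    assert(n > 0 && (n & (n - 1)) == 0);   // length must be a power of 2
    const float invSqrt2 = 1.0f / std::sqrt(2.0f);
    std::vector<float> tmp(n);

    // Each level halves the active length: the first half receives scaled
    // pairwise sums (approximation), the second half scaled differences.
    for (std::size_t len = n; len > 1; len /= 2) {
        const std::size_t half = len / 2;
        for (std::size_t i = 0; i < half; ++i) {
            tmp[i]        = (signal[2 * i] + signal[2 * i + 1]) * invSqrt2;
            tmp[half + i] = (signal[2 * i] - signal[2 * i + 1]) * invSqrt2;
        }
        std::copy(tmp.begin(), tmp.begin() + len, signal.begin());
    }
    return signal;
}
```

The power-of-2 requirement follows directly from the loop: every level must be able to pair up all remaining approximation coefficients.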


Fast Walsh Transform
Naturally (Hadamard) ordered Fast Walsh Transform for batched vectors of arbitrary eligible (power of two) lengths.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac
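
As a rough illustration of the transform named above, the following C++ sketch applies the unnormalized, naturally (Hadamard) ordered Walsh transform to a single vector in place. It is a CPU reference under assumptions, not the sample's batched CUDA kernel, and the function name fastWalshTransform is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// In-place, unnormalized Walsh-Hadamard transform in natural (Hadamard)
// order. The vector length must be a power of two.
void fastWalshTransform(std::vector<float>& data)
{
    const std::size_t n = data.size();
    assert(n > 0 && (n & (n - 1)) == 0);

    // log2(n) butterfly stages; each stage combines elements that are
    // "stride" apart into their sum and difference.
    for (std::size_t stride = 1; stride < n; stride *= 2) {
        for (std::size_t base = 0; base < n; base += 2 * stride) {
            for (std::size_t i = base; i < base + stride; ++i) {
                const float a = data[i];
                const float b = data[i + stride];
                data[i] = a + b;
                data[i + stride] = a - b;
            }
        }
    }
}
```

Applying the function twice returns the original vector scaled by its length, which is a convenient sanity check for any implementation.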


CUDA Histogram
This sample demonstrates efficient implementations of 64-bin and 256-bin histograms.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac
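
One common way to make a parallel histogram efficient is to let each worker fill a private sub-histogram and merge the partial results at the end, which avoids contention on shared bins. Whether the sample does exactly this is an assumption; the C++ sketch below only illustrates the general idea sequentially, and histogram256 and numPartials are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 256-bin histogram built from private partial histograms that are merged
// at the end. On a GPU each partial would belong to a thread group; here
// the chunks are processed one after another for clarity.
std::array<unsigned, 256> histogram256(const std::vector<std::uint8_t>& data,
                                       std::size_t numPartials)
{
    assert(numPartials > 0);
    std::vector<std::array<unsigned, 256>> partial(
        numPartials, std::array<unsigned, 256>{});   // all bins start at 0
    const std::size_t chunk = (data.size() + numPartials - 1) / numPartials;

    // Phase 1: each partial histogram counts only its own chunk of input.
    for (std::size_t p = 0; p < numPartials; ++p) {
        const std::size_t begin = std::min(p * chunk, data.size());
        const std::size_t end = std::min(begin + chunk, data.size());
        for (std::size_t i = begin; i < end; ++i) {
            ++partial[p][data[i]];
        }
    }

    // Phase 2: merge the partial histograms bin by bin.
    std::array<unsigned, 256> result{};
    for (std::size_t p = 0; p < numPartials; ++p) {
        for (std::size_t bin = 0; bin < 256; ++bin) {
            result[bin] += partial[p][bin];
        }
    }
    return result;
}
```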


Box Filter
Fast image box filter using CUDA with OpenGL rendering.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


Post-Process in OpenGL
This sample shows how to post-process an image rendered in OpenGL using CUDA.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


DirectX Texture Compressor (DXTC)
High Quality DXT Compression using CUDA. This example shows how to implement an existing computationally-intensive CPU compression algorithm in parallel on the GPU, and obtain an order of magnitude performance improvement.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


Image Denoising
This sample demonstrates two adaptive image denoising techniques, KNN and NLM, based on computing both the geometric distance and the color distance between texels. Both techniques are implemented in the DirectX SDK using shaders; in addition to counterparts of those implementations, this sample includes a much faster variation of NLM that takes advantage of shared memory.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


Sobel Filter
This sample implements the Sobel edge detection filter for 8-bit monochrome images.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac
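
The Sobel edge detection filter for 8-bit monochrome images can be sketched on the CPU as below. This is an illustrative C++ reference, not the sample's CUDA kernel; the function name sobelFilter, the |gx| + |gy| magnitude approximation, the clamp to 255, and the choice to leave border pixels at 0 are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Sobel edge detection on a row-major 8-bit monochrome image.
// Border pixels are left at 0 (an assumption made for simplicity).
std::vector<std::uint8_t> sobelFilter(const std::vector<std::uint8_t>& src,
                                      int width, int height)
{
    assert(src.size() == static_cast<std::size_t>(width) * height);
    std::vector<std::uint8_t> dst(src.size(), 0);

    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            // Helper that reads the pixel at offset (dx, dy) as an int.
            auto p = [&](int dx, int dy) {
                return static_cast<int>(src[(y + dy) * width + (x + dx)]);
            };
            // Horizontal and vertical gradients from the 3x3 Sobel kernels.
            const int gx = -p(-1, -1) + p(1, -1)
                           - 2 * p(-1, 0) + 2 * p(1, 0)
                           - p(-1, 1) + p(1, 1);
            const int gy = -p(-1, -1) - 2 * p(0, -1) - p(1, -1)
                           + p(-1, 1) + 2 * p(0, 1) + p(1, 1);
            // Approximate the gradient magnitude and clamp to 8 bits.
            const int magnitude = std::abs(gx) + std::abs(gy);
            dst[y * width + x] =
                static_cast<std::uint8_t>(std::min(magnitude, 255));
        }
    }
    return dst;
}
```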


Recursive Gaussian Filter
This sample implements a Gaussian blur using Deriche's recursive method. The advantage of this method is that the execution time is independent of the filter width.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


CUDA Video Decoder D3D9 API
This sample demonstrates how to efficiently use the CUDA Video Decoder API to decode MPEG-2, VC-1, or H.264 sources. YUV to RGB conversion of the video is performed by a CUDA kernel, and the result is rendered to a D3D9 surface. By default the decoded video is not displayed on screen; pass -displayvideo on the command line to see the video output. Requires a Direct3D capable device and Compute Capability 1.1 or higher.

Download - Windows (x86)
Download - Windows (x64)


CUDA Video Decoder GL API
This sample demonstrates how to efficiently use the CUDA Video Decoder API to decode video sources based on MPEG-2, VC-1, and H.264. YUV to RGB conversion of the video is performed by a CUDA kernel, and the result is rendered to an OpenGL surface. By default the decoded video appears black; add -displayvideo to the command line to display it. Requires Compute Capability 1.1 or higher.

Download - Windows (x86)
Download - Windows (x64)


Bicubic Texture Filtering
This sample demonstrates how to efficiently implement bicubic texture filtering in CUDA.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


FFT-Based 2D Convolution
This sample demonstrates how 2D convolutions with very large kernel sizes can be efficiently implemented using FFTs.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


CUDA Separable Convolution
This sample implements a separable convolution filter of a 2D signal with a Gaussian kernel.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac
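
A separable convolution replaces one 2D pass with a row pass followed by a column pass that reuse the same 1D kernel. The C++ sketch below shows that structure on the CPU; it is not the sample's CUDA code, and the function name convolveSeparable, the zero padding outside the image, and the use of one kernel for both directions are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Separable 2D convolution of a row-major float image with a 1D kernel of
// odd length, applied first along rows and then along columns.
// Samples outside the image are treated as zero (an assumption).
std::vector<float> convolveSeparable(const std::vector<float>& src,
                                     int width, int height,
                                     const std::vector<float>& kernel)
{
    assert(src.size() == static_cast<std::size_t>(width) * height);
    assert(kernel.size() % 2 == 1);
    const int radius = static_cast<int>(kernel.size() / 2);
    std::vector<float> rows(src.size(), 0.0f);
    std::vector<float> out(src.size(), 0.0f);

    // Row pass: convolve every row with the 1D kernel.
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int k = -radius; k <= radius; ++k) {
                const int xx = x + k;
                if (xx >= 0 && xx < width) {
                    sum += src[y * width + xx] * kernel[radius - k];
                }
            }
            rows[y * width + x] = sum;
        }
    }

    // Column pass: convolve every column of the row-pass result.
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int k = -radius; k <= radius; ++k) {
                const int yy = y + k;
                if (yy >= 0 && yy < height) {
                    sum += rows[yy * width + x] * kernel[radius - k];
                }
            }
            out[y * width + x] = sum;
        }
    }
    return out;
}
```

For a kernel of length K, the two passes cost about 2K multiply-adds per pixel instead of the K*K needed by the equivalent 2D kernel.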


Texture-based Separable Convolution
Texture-based implementation of a separable 2D convolution with a Gaussian kernel. Used for performance comparison against convolutionSeparable.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


Volume Rendering with 3D Textures
This sample demonstrates basic volume rendering using 3D Textures.

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac