NVIDIA OpenCL SDK - Data-Parallel Algorithms

The GPU Computing SDK provides examples with source code, utilities, and white papers to help you get started writing GPU Computing software. The full SDK includes dozens of code samples covering a wide range of applications.

Refer to the SDK README for related information.

The latest NVIDIA display drivers are required to run the code samples.

The NVIDIA CUDA Toolkit is required to compile code samples. Please obtain the CUDA Toolkit from CUDA Zone.

OpenCL Vector Addition

Element-by-element addition of two 1-dimensional arrays. Implemented in OpenCL for CUDA GPUs, with functional comparison against a simple C++ host CPU implementation.
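
A minimal sketch of what such a host-side reference and comparison could look like. This is not the SDK's source; the names `vector_add_host` and `results_match` and the tolerance parameter are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference: c[i] = a[i] + b[i] for every index i.
std::vector<float> vector_add_host(const std::vector<float>& a,
                                   const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        c[i] = a[i] + b[i];
    }
    return c;
}

// Hypothetical check: true when both results agree element-wise
// within `tolerance` (an assumed parameter, not an SDK setting).
bool results_match(const std::vector<float>& device_result,
                   const std::vector<float>& host_result,
                   float tolerance) {
    if (device_result.size() != host_result.size()) return false;
    for (std::size_t i = 0; i < host_result.size(); ++i) {
        if (std::fabs(device_result[i] - host_result[i]) > tolerance) {
            return false;
        }
    }
    return true;
}
```

In a real run, `device_result` would be the buffer read back from the GPU, and the comparison would decide whether the sample reports a pass.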

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


OpenCL Dot Product

Dot product (scalar product) of a set of input vector pairs. Implemented in OpenCL for CUDA GPUs, with functional comparison against a simple C++ host CPU implementation.
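
A minimal sketch of a host-side reference for a batch of dot products. The contiguous layout (pair `v` occupies elements `v * element_count` onward) and the name `dot_products_host` are assumptions for illustration, not taken from the SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference: one dot product per vector pair.
// Assumed layout: pair v is stored at [v * element_count, (v + 1) * element_count).
std::vector<float> dot_products_host(const std::vector<float>& a,
                                     const std::vector<float>& b,
                                     std::size_t vector_count,
                                     std::size_t element_count) {
    assert(a.size() == vector_count * element_count);
    assert(b.size() == a.size());
    std::vector<float> result(vector_count, 0.0f);
    for (std::size_t v = 0; v < vector_count; ++v) {
        float sum = 0.0f;
        for (std::size_t e = 0; e < element_count; ++e) {
            const std::size_t index = v * element_count + e;
            sum += a[index] * b[index];
        }
        result[v] = sum;
    }
    return result;
}
```

Each output element depends only on its own pair, which is what makes the problem a natural fit for one work-group (or work-item) per pair on the GPU.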

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


OpenCL Matrix Vector Multiplication

Simple matrix-vector multiplication example showing increasingly optimized implementations.
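
For orientation, the computation being optimized can be sketched on the host as below. This shows only an unoptimized baseline; the row-major layout and the name `mat_vec_host` are assumptions, and none of the sample's optimized variants are reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical baseline: y = M * x, with M stored row-major (assumed layout).
std::vector<float> mat_vec_host(const std::vector<float>& matrix,
                                const std::vector<float>& x,
                                std::size_t rows,
                                std::size_t cols) {
    assert(matrix.size() == rows * cols);
    assert(x.size() == cols);
    std::vector<float> y(rows, 0.0f);
    for (std::size_t r = 0; r < rows; ++r) {
        float sum = 0.0f;
        for (std::size_t c = 0; c < cols; ++c) {
            sum += matrix[r * cols + c] * x[c];
        }
        y[r] = sum;
    }
    return y;
}
```

Every row is an independent dot product with `x`, so a straightforward GPU mapping assigns one work-item per row before any further tuning.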

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


OpenCL Scan

This example demonstrates an efficient OpenCL implementation of parallel prefix sum, also known as "scan". Given an array of numbers, scan computes a new array in which each element is the sum of all the elements before it in the input array.
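
The definition above (each output is the sum of everything before it, often called an exclusive scan) can be sketched sequentially as follows. The second function illustrates one common way to split a scan into per-block scans plus block offsets; both function names and the `block_size` parameter are assumptions, and neither is claimed to be the sample's algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sequential reference: output[i] = input[0] + ... + input[i - 1].
std::vector<int> exclusive_scan_host(const std::vector<int>& input) {
    std::vector<int> output(input.size());
    int running_sum = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        output[i] = running_sum;
        running_sum += input[i];
    }
    return output;
}

// Hypothetical blocked variant: scan each block locally, then add the
// total of all earlier blocks. Produces the same result as the function above.
std::vector<int> blocked_exclusive_scan(const std::vector<int>& input,
                                        std::size_t block_size) {
    assert(block_size > 0);
    std::vector<int> output(input.size());
    int block_offset = 0;  // sum of all elements in earlier blocks
    for (std::size_t start = 0; start < input.size(); start += block_size) {
        const std::size_t end = std::min(start + block_size, input.size());
        int local_sum = 0;
        for (std::size_t i = start; i < end; ++i) {
            output[i] = block_offset + local_sum;
            local_sum += input[i];
        }
        block_offset += local_sum;
    }
    return output;
}
```

For the input 3, 1, 7, 0, 4, 1, 6, 3 both functions yield 0, 3, 4, 11, 11, 15, 16, 22.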

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


Parallel Reduction

A parallel sum reduction that computes the sum of large arrays of values. This sample demonstrates several important optimization strategies for parallel algorithms like reduction.
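
A minimal sketch of the tree-shaped access pattern that parallel reductions are usually built on, emulated sequentially on the host. The name `tree_reduce_sum` is hypothetical, and the sample's specific optimization strategies are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical emulation of a tree reduction. In each pass, element i
// absorbs element i + half; the additions within one pass touch disjoint
// slots, so on a GPU they could all run in parallel.
float tree_reduce_sum(std::vector<float> data) {
    if (data.empty()) return 0.0f;
    std::size_t n = data.size();
    while (n > 1) {
        const std::size_t half = (n + 1) / 2;  // rounds up for odd n
        for (std::size_t i = 0; i + half < n; ++i) {
            data[i] += data[i + half];
        }
        n = half;
    }
    assert(n == 1);  // exactly one partial sum remains
    return data[0];
}
```

The number of passes grows with the logarithm of the input size, which is the main reason this pattern is preferred over a single running sum on parallel hardware.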

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


OpenCL Sorting Networks

This sample implements the bitonic sort algorithm for batches of short arrays.
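
A minimal host-side sketch of a bitonic sorting network applied to a batch of equal-length arrays. It assumes each array length is a power of two and that the arrays are stored back to back; the function names are hypothetical and the code is not taken from the sample.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: sorts a[0..n) ascending with a bitonic network.
// Assumes n is a power of two (or zero).
void bitonic_sort(int* a, std::size_t n) {
    assert((n & (n - 1)) == 0);
    for (std::size_t k = 2; k <= n; k <<= 1) {          // size of merged runs
        for (std::size_t j = k >> 1; j > 0; j >>= 1) {  // compare distance
            for (std::size_t i = 0; i < n; ++i) {
                const std::size_t partner = i ^ j;
                if (partner <= i) continue;  // handle each pair once
                const bool ascending = (i & k) == 0;
                const bool out_of_order =
                    ascending ? a[i] > a[partner] : a[i] < a[partner];
                if (out_of_order) std::swap(a[i], a[partner]);
            }
        }
    }
}

// Hypothetical batch driver: sorts each consecutive chunk independently.
void bitonic_sort_batches(std::vector<int>& data, std::size_t array_length) {
    assert(array_length > 0 && data.size() % array_length == 0);
    for (std::size_t start = 0; start < data.size(); start += array_length) {
        bitonic_sort(data.data() + start, array_length);
    }
}
```

The compare-and-swap pattern depends only on indices, never on the data, so all pairs in one inner pass can be processed in parallel and each short array can be handled by its own work-group.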

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


OpenCL 64-bin and 256-bin Histogram

This sample demonstrates efficient implementations of 64-bin and 256-bin histograms.
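
A minimal host-side sketch of both histogram sizes over byte data. The binning rule for the 64-bin case (dropping the two low bits of each byte) is an assumption made for illustration, as is the name `histogram_host`; the sample's own binning and its GPU implementation may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference. 256 bins: one bin per byte value.
// 64 bins: assumed here to use the top six bits of each byte (value >> 2).
std::vector<unsigned int> histogram_host(const std::vector<std::uint8_t>& data,
                                         unsigned int bin_count) {
    assert(bin_count == 64 || bin_count == 256);
    const unsigned int shift = (bin_count == 64) ? 2u : 0u;
    std::vector<unsigned int> bins(bin_count, 0u);
    for (std::uint8_t value : data) {
        ++bins[value >> shift];
    }
    return bins;
}
```

On a GPU the difficulty is that many work-items may increment the same bin at once, which is why an efficient implementation needs more care than this sequential loop suggests.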

Whitepaper
Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


OpenCL Separable Convolution

This sample implements a convolution filter for a 2D image with an arbitrary separable kernel.
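
A separable kernel can be applied as a 1D pass along rows followed by a 1D pass along columns. The sketch below shows that idea on the host, assuming a row-major float image, odd-length 1D kernels, and zero padding at the borders; all names are hypothetical and the border handling in the sample may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: one 1D convolution pass over a row-major image.
// Pixels outside the image are treated as zero (an assumed border policy).
std::vector<float> convolve_pass(const std::vector<float>& src,
                                 int width, int height,
                                 const std::vector<float>& kernel,
                                 bool horizontal) {
    assert(width >= 0 && height >= 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    assert(kernel.size() % 2 == 1);
    const int radius = static_cast<int>(kernel.size() / 2);
    std::vector<float> dst(src.size(), 0.0f);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int k = -radius; k <= radius; ++k) {
                const int sx = horizontal ? x - k : x;
                const int sy = horizontal ? y : y - k;
                if (sx < 0 || sx >= width || sy < 0 || sy >= height) continue;
                sum += src[static_cast<std::size_t>(sy) * width + sx] *
                       kernel[static_cast<std::size_t>(k + radius)];
            }
            dst[static_cast<std::size_t>(y) * width + x] = sum;
        }
    }
    return dst;
}

// Hypothetical driver: row pass with kernel_x, then column pass with kernel_y.
std::vector<float> convolve_separable(const std::vector<float>& image,
                                      int width, int height,
                                      const std::vector<float>& kernel_x,
                                      const std::vector<float>& kernel_y) {
    return convolve_pass(convolve_pass(image, width, height, kernel_x, true),
                         width, height, kernel_y, false);
}
```

For a kernel of radius r, the two passes cost on the order of 2(2r + 1) multiply-adds per pixel instead of (2r + 1)^2 for the equivalent 2D kernel.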

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


OpenCL Particle Collision Simulation

Simulation of elastic collisions of a large number of bodies. Implemented in OpenCL for CUDA GPUs.
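
As background, the textbook velocity update for a perfectly elastic collision between two spheres can be sketched as below. This is the standard momentum- and energy-conserving formula, not necessarily the collision model the sample uses; the types and the function name are hypothetical, and contact detection is left to the caller.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };
struct Particle { Vec3 position; Vec3 velocity; float mass; };

static float dot3(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Hypothetical helper: updates both velocities for an elastic collision
// along the line joining the two centers. Does nothing if the particles
// coincide or are already moving apart.
void resolve_elastic_collision(Particle& a, Particle& b) {
    assert(a.mass > 0.0f && b.mass > 0.0f);
    const Vec3 d{a.position.x - b.position.x,
                 a.position.y - b.position.y,
                 a.position.z - b.position.z};
    const Vec3 rel{a.velocity.x - b.velocity.x,
                   a.velocity.y - b.velocity.y,
                   a.velocity.z - b.velocity.z};
    const float dist2 = dot3(d, d);
    const float approach = dot3(rel, d);  // negative when closing in
    if (dist2 == 0.0f || approach >= 0.0f) return;
    const float scale = 2.0f * approach / ((a.mass + b.mass) * dist2);
    a.velocity.x -= scale * b.mass * d.x;
    a.velocity.y -= scale * b.mass * d.y;
    a.velocity.z -= scale * b.mass * d.z;
    b.velocity.x += scale * a.mass * d.x;
    b.velocity.y += scale * a.mass * d.y;
    b.velocity.z += scale * a.mass * d.z;
}
```

For two equal masses in a head-on collision, the update simply exchanges their velocities.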

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac


OpenCL N-Body Physics Simulation

Gravitational simulation of a large number of bodies. Implemented in OpenCL for CUDA GPUs.
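
A minimal host-side sketch of the all-pairs gravitational acceleration that such a simulation evaluates each time step. The softening term, the gravitational constant passed as a parameter, and all type and function names are assumptions for illustration; the sample's actual kernel and integrator are not shown.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Body { double x, y, z, mass; };
struct Accel { double x, y, z; };

// Hypothetical reference: for each body i, sums G * m_j * r_ij / (|r_ij|^2 + eps^2)^(3/2)
// over all other bodies j, where r_ij points from body i to body j and eps is `softening`.
std::vector<Accel> gravitational_accelerations(const std::vector<Body>& bodies,
                                               double G, double softening) {
    assert(softening >= 0.0);
    std::vector<Accel> acc(bodies.size(), Accel{0.0, 0.0, 0.0});
    for (std::size_t i = 0; i < bodies.size(); ++i) {
        for (std::size_t j = 0; j < bodies.size(); ++j) {
            if (i == j) continue;
            const double dx = bodies[j].x - bodies[i].x;
            const double dy = bodies[j].y - bodies[i].y;
            const double dz = bodies[j].z - bodies[i].z;
            const double dist2 =
                dx * dx + dy * dy + dz * dz + softening * softening;
            if (dist2 == 0.0) continue;  // coincident bodies, no softening
            const double inv_dist = 1.0 / std::sqrt(dist2);
            const double s = G * bodies[j].mass * inv_dist * inv_dist * inv_dist;
            acc[i].x += s * dx;
            acc[i].y += s * dy;
            acc[i].z += s * dz;
        }
    }
    return acc;
}
```

The work grows with the square of the body count, and each body's sum is independent of the others, which is why the problem maps well to one work-item per body on a GPU.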

Download - Windows (x86)
Download - Windows (x64)
Download - Linux/Mac

Last Update: 2/28/2010