The CUDA Developer SDK provides examples with source code, utilities, and white papers to help you get started writing software with CUDA. The SDK includes dozens of code samples covering a wide range of applications, including those listed below.
This code is released free of charge for use in derivative works, whether academic, commercial, or personal. (Full License)
The NVIDIA CUDA Toolkit is required to compile and run the code samples. Please obtain the CUDA Toolkit before building them.
|
Monte-Carlo Option Pricing with multi-GPU support
This sample evaluates the fair call price for a given set of European options using the Monte Carlo approach, taking advantage of all CUDA-capable GPUs installed in the system. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Download - Windows
Download - Linux
|
|
|
FFT Ocean Simulation
This sample simulates an ocean height field using CUFFT and renders the result using OpenGL. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Download - Windows
Download - Linux
|
|
|
256-bin Histogram
This sample demonstrates an efficient implementation of a 256-bin histogram. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Whitepaper
Download - Windows
Download - Linux
|
|
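When experimenting with the 256-bin histogram sample above, a small CPU reference can help check GPU results. The sketch below is not the SDK's code. It is a plain Python illustration in which each chunk of input builds its own partial histogram before the partial results are merged, which is one common way to organize the work on a GPU. The function name `histogram256` and the `chunk_size` parameter are hypothetical.

```python
def histogram256(data, chunk_size=4096):
    """Count how often each byte value (0..255) occurs in `data`.

    Each chunk fills a private partial histogram that is then merged
    into the total. On a GPU, a chunk would roughly correspond to the
    share of the input handled by one thread block (an assumption of
    this sketch, not a statement about the SDK sample).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = [0] * 256
    for start in range(0, len(data), chunk_size):
        partial = [0] * 256
        for byte in data[start:start + chunk_size]:
            partial[byte] += 1
        # Merge step: add the partial counts into the final histogram.
        for bin_index in range(256):
            total[bin_index] += partial[bin_index]
    return total


if __name__ == "__main__":
    counts = histogram256(bytes([0, 1, 1, 255]))
    print(counts[0], counts[1], counts[255])
```

The input is expected to be a `bytes` object (or any sequence of integers in the range 0..255).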
|
64-bin Histogram
This sample demonstrates an efficient implementation of a 64-bin histogram. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Whitepaper
Download - Windows
Download - Linux
|
|
|
FFT-Based 2D Convolution
This sample demonstrates how 2D convolutions with very large kernel sizes can be implemented efficiently using FFTs. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Whitepaper
Download - Windows
Download - Linux
|
|
|
MersenneTwister
This sample implements the Mersenne Twister random number generator and the Cartesian Box-Muller transformation on the GPU. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Whitepaper
Download - Windows
Download - Linux
|
|
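To make the Box-Muller step of the MersenneTwister sample above concrete, here is a minimal CPU sketch. It is not the SDK's implementation. It leans on Python's standard `random.Random`, which is itself a Mersenne Twister generator, and applies the basic (Cartesian) Box-Muller formulas to pairs of uniform samples. The function names `box_muller` and `normal_samples` are hypothetical.

```python
import math
import random


def box_muller(u1, u2):
    """Map two uniform samples to two independent standard normal samples.

    u1 must lie in (0, 1] so that log(u1) is defined; u2 lies in [0, 1).
    """
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def normal_samples(count, seed=0):
    """Generate `count` standard normal samples from a seeded generator."""
    rng = random.Random(seed)  # CPython's generator is a Mersenne Twister
    samples = []
    while len(samples) < count:
        u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] to avoid log(0)
        u2 = rng.random()
        samples.extend(box_muller(u1, u2))
    return samples[:count]


if __name__ == "__main__":
    values = normal_samples(10000, seed=1)
    mean = sum(values) / len(values)
    print("sample mean:", mean)
```

Because each pair of uniforms is transformed independently, the transformation parallelizes naturally, which is presumably why it suits the GPU.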
|
Monte-Carlo Option Pricing
This sample evaluates the fair call price for a given set of European options using the Monte Carlo approach. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Whitepaper
Download - Windows
Download - Linux
|
|
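As a rough illustration of the Monte Carlo approach named in the sample above, the following CPU sketch prices a single European call. It assumes the underlying follows geometric Brownian motion under the risk-neutral measure, which the page does not state, so treat the model choice as an assumption. The function name and parameter names are hypothetical and do not come from the SDK.

```python
import math
import random


def monte_carlo_call(spot, strike, rate, volatility, maturity, paths, seed=0):
    """Estimate a European call price by averaging simulated payoffs.

    Assumes geometric Brownian motion for the underlying price
    (an assumption of this sketch).
    """
    rng = random.Random(seed)
    drift = (rate - 0.5 * volatility ** 2) * maturity
    diffusion = volatility * math.sqrt(maturity)
    payoff_sum = 0.0
    for _ in range(paths):
        # One terminal price per path; European options need no time steps.
        terminal = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        payoff_sum += max(terminal - strike, 0.0)
    # Discount the average payoff back to today.
    return math.exp(-rate * maturity) * payoff_sum / paths


if __name__ == "__main__":
    print(monte_carlo_call(100.0, 100.0, 0.05, 0.2, 1.0, paths=100000))
```

Each path is independent of the others, so the payoff sum is the kind of workload that can be split across many threads and, in principle, across several GPUs.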
|
Binomial Option Pricing
This sample evaluates the fair call price for a given set of European options under the binomial model. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Whitepaper
Download - Windows
Download - Linux
|
|
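The binomial model used by the sample above can be sketched on the CPU in a few lines. The version below assumes the Cox-Ross-Rubinstein parameterization of the tree, which is one common choice and not necessarily the one the SDK sample uses. All names are hypothetical.

```python
import math


def binomial_call(spot, strike, rate, volatility, maturity, steps):
    """Price a European call on a recombining binomial tree.

    Uses Cox-Ross-Rubinstein up/down factors (an assumption of this sketch).
    """
    dt = maturity / steps
    up = math.exp(volatility * math.sqrt(dt))
    down = 1.0 / up
    growth = math.exp(rate * dt)
    prob_up = (growth - down) / (up - down)  # risk-neutral probability
    discount = 1.0 / growth
    # Payoffs at expiry; node j has seen j up moves and (steps - j) down moves.
    values = [
        max(spot * up ** j * down ** (steps - j) - strike, 0.0)
        for j in range(steps + 1)
    ]
    # Walk back through the tree, one time level per iteration.
    for level in range(steps, 0, -1):
        values = [
            discount * (prob_up * values[j + 1] + (1.0 - prob_up) * values[j])
            for j in range(level)
        ]
    return values[0]


if __name__ == "__main__":
    print(binomial_call(100.0, 100.0, 0.05, 0.2, 1.0, steps=500))
```

Within one time level every node depends only on two nodes of the following level, so the nodes of a level can be updated in parallel.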
|
Image denoising
This sample demonstrates two adaptive image denoising techniques, KNN and NLM, based on computation of both geometric and color distance between texels. While both techniques are implemented in the DirectX SDK using shaders, a massively sped-up variation of the latter technique, taking advantage of shared memory, is implemented in addition to the DirectX counterparts. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Whitepaper
Download - Windows
Download - Linux
|
|
|
DirectX Texture Compressor (DXTC)
High Quality DXT Compression using CUDA.
This example shows how to implement an existing computationally-intensive CPU compression algorithm in parallel on the GPU, and obtain an order of magnitude performance improvement. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Whitepaper
Download - Windows
Download - Linux
|
|
|
N-Body Simulation
This sample demonstrates an efficient all-pairs gravitational n-body simulation in CUDA. This sample accompanies the GPU Gems 3 chapter "Fast N-Body Simulation with CUDA". |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Whitepaper
Download - Windows
Download - Linux
|
|
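The all-pairs idea behind the N-Body sample above is that every body accumulates the gravitational pull of every other body. A minimal CPU sketch, with hypothetical names and a softening term added to avoid division by zero (a common choice, assumed here), might look like this:

```python
import math


def accelerations(positions, masses, softening=1e-3, gravity=1.0):
    """Compute the acceleration on each body from all bodies (O(n^2)).

    `softening` must be positive. With it, the self-interaction term
    has zero displacement and therefore contributes nothing, so no
    special case for i == j is needed.
    """
    result = []
    for xi, yi, zi in positions:
        ax = ay = az = 0.0
        for (xj, yj, zj), mass in zip(positions, masses):
            dx, dy, dz = xj - xi, yj - yi, zj - zi
            dist_sq = dx * dx + dy * dy + dz * dz + softening * softening
            inv_dist_cubed = 1.0 / (dist_sq * math.sqrt(dist_sq))
            scale = gravity * mass * inv_dist_cubed
            ax += scale * dx
            ay += scale * dy
            az += scale * dz
        result.append((ax, ay, az))
    return result


if __name__ == "__main__":
    bodies = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    print(accelerations(bodies, [1.0, 1.0]))
```

The outer loop is independent per body, which is what makes the all-pairs formulation a good fit for one-thread-per-body parallelism.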
|
Parallel Reduction
A parallel sum reduction that computes the sum of large arrays of values. This sample demonstrates several important optimization strategies for parallel algorithms like reduction. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Whitepaper
Download - Windows
Download - Linux
|
|
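The tree-shaped access pattern of a parallel sum reduction, as in the sample above, can be simulated serially. The sketch below is illustrative only and does not reproduce the SDK's optimizations. Each pass halves the stride, and every iteration of the inner loop stands in for the work one thread would do. The function name is hypothetical.

```python
def tree_reduce_sum(values):
    """Sum `values` using the halving-stride pattern of a parallel reduction."""
    work = list(values)
    if not work:
        return 0
    # Pad with zeros up to a power of two so every pass pairs elements evenly.
    size = 1
    while size < len(work):
        size *= 2
    work.extend([0] * (size - len(work)))
    stride = size // 2
    while stride > 0:
        # On a GPU, each index i below would be handled by a separate thread.
        for i in range(stride):
            work[i] += work[i + stride]
        stride //= 2
    return work[0]


if __name__ == "__main__":
    print(tree_reduce_sum(range(1, 101)))  # prints 5050
```

The number of passes grows with the logarithm of the input length, which is the main attraction of the tree layout compared with a single sequential loop.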
|
Fast Walsh Transform
Naturally (Hadamard)-ordered Fast Walsh Transform for batched vectors of arbitrary eligible (power-of-two) lengths. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Download - Windows
Download - Linux
|
|
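For reference, the naturally (Hadamard) ordered transform described in the sample above can be written as a short butterfly loop on the CPU. This is an unnormalized sketch with a hypothetical function name. It handles a single vector, not a batch.

```python
def fwht(vector):
    """Return the unnormalized Walsh-Hadamard transform in natural order."""
    data = list(vector)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    half = 1
    while half < n:
        # Combine blocks of size 2 * half with sum/difference butterflies.
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                a, b = data[i], data[i + half]
                data[i], data[i + half] = a + b, a - b
        half *= 2
    return data


if __name__ == "__main__":
    print(fwht([1, 0, 1, 0, 0, 1, 1, 0]))  # prints [4, 2, 0, -2, 0, 2, 0, 2]
```

Applying `fwht` twice returns the original vector scaled by its length, which makes for a convenient self-check.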
|
Eigenvalues
The computation of all or a subset of all eigenvalues is an important problem in linear algebra, statistics, physics, and many other fields. This sample demonstrates a parallel implementation of a bisection algorithm that computes all eigenvalues of a symmetric tridiagonal matrix of arbitrary size with CUDA. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Whitepaper
Download - Windows
Download - Linux
|
|
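The bisection idea mentioned in the Eigenvalues sample above can be illustrated serially. The sketch below counts how many eigenvalues lie below a trial value using a Sturm-sequence style recurrence, then bisects inside Gershgorin bounds for each eigenvalue in turn. It is a textbook formulation with hypothetical names, not the SDK's parallel algorithm. `diag` holds the main diagonal and `off` holds the off-diagonal (one element shorter).

```python
TINY = 1e-30  # stands in for an exact zero pivot to avoid division by zero


def count_below(diag, off, x):
    """Count eigenvalues smaller than x for a non-empty tridiagonal matrix."""
    count = 0
    q = diag[0] - x
    for i in range(len(diag)):
        if i > 0:
            q = diag[i] - x - off[i - 1] ** 2 / q
        if q == 0.0:
            q = TINY
        if q < 0.0:
            count += 1
    return count


def tridiagonal_eigenvalues(diag, off, tol=1e-12):
    """Return all eigenvalues in ascending order via bisection."""
    n = len(diag)
    if n == 0:
        return []
    # Gershgorin discs give an interval that contains every eigenvalue.
    radius = [
        (abs(off[i - 1]) if i > 0 else 0.0) + (abs(off[i]) if i < n - 1 else 0.0)
        for i in range(n)
    ]
    lower = min(d - r for d, r in zip(diag, radius))
    upper = max(d + r for d, r in zip(diag, radius))
    result = []
    for k in range(n):  # k-th smallest eigenvalue, counting from zero
        lo, hi = lower, upper
        for _ in range(200):  # bounded loop guards against stalling
            if hi - lo <= tol:
                break
            mid = 0.5 * (lo + hi)
            if count_below(diag, off, mid) > k:
                hi = mid
            else:
                lo = mid
        result.append(0.5 * (lo + hi))
    return result


if __name__ == "__main__":
    print(tridiagonal_eigenvalues([2.0, 2.0, 2.0], [-1.0, -1.0]))
```

Each eigenvalue's interval is refined independently of the others, which hints at why bisection lends itself to a parallel implementation.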
|
Sobel Filter
This sample implements the Sobel edge detection filter for 8-bit monochrome images. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Download - Windows
Download - Linux
|
|
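A CPU reference for the Sobel filter sample above is short enough to show in full. The sketch assumes the image is a list of rows of 8-bit values, approximates the gradient magnitude as |gx| + |gy| clamped to 255, and leaves border pixels at zero. Those three choices are assumptions of this example and are not taken from the SDK sample.

```python
def sobel(image):
    """Apply a 3x3 Sobel edge detector to an 8-bit monochrome image."""
    height = len(image)
    width = len(image[0]) if image else 0
    output = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column.
            gx = (
                image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1]
            )
            # Vertical gradient: bottom row minus top row.
            gy = (
                image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1]
            )
            output[y][x] = min(255, abs(gx) + abs(gy))
    return output


if __name__ == "__main__":
    step_edge = [[0, 0, 10, 10] for _ in range(3)]
    print(sobel(step_edge)[1])
```

Every output pixel depends only on its 3x3 input neighborhood, so pixels can be computed independently of one another.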
|
Scan
This example demonstrates an efficient CUDA implementation of parallel prefix sum, also known as "scan". Given an array of numbers, scan computes a new array in which each element is the sum of all the elements before it in the input array. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Whitepaper
Download - Windows
Download - Linux
|
|
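The definition given in the Scan sample above (each output element is the sum of all input elements before it) describes an exclusive scan. One well-known parallel formulation is the work-efficient up-sweep/down-sweep scheme. Whether the SDK sample uses exactly this scheme is not stated on this page, so the serial simulation below should be read as an assumption-laden sketch with a hypothetical function name.

```python
def exclusive_scan(values):
    """Exclusive prefix sum using an up-sweep and a down-sweep over a tree."""
    n = len(values)
    if n == 0:
        return []
    size = 1
    while size < n:
        size *= 2
    tree = list(values) + [0] * (size - n)  # pad to a power of two
    # Up-sweep: build partial sums at the internal nodes of the tree.
    stride = 1
    while stride < size:
        for i in range(2 * stride - 1, size, 2 * stride):
            tree[i] += tree[i - stride]
        stride *= 2
    # Down-sweep: clear the root, then push prefix sums back down.
    tree[size - 1] = 0
    stride = size // 2
    while stride >= 1:
        for i in range(2 * stride - 1, size, 2 * stride):
            left = tree[i - stride]
            tree[i - stride] = tree[i]
            tree[i] += left
        stride //= 2
    return tree[:n]


if __name__ == "__main__":
    print(exclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]))
    # prints [0, 3, 4, 11, 11, 15, 16, 22]
```

Within each pass the inner-loop iterations touch disjoint elements, so on a GPU they could run as separate threads with a barrier between passes.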
|
Scan of Large Arrays
This example demonstrates an efficient CUDA implementation of parallel prefix sum (also known as "scan") for arbitrary-sized arrays. Given an array of numbers, scan computes a new array in which each element is the sum of all the elements before it in the input array. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Whitepaper
Download - Windows
Download - Linux
|
|
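For arrays too large to scan in one block, as in the sample above, a common strategy is to scan fixed-size blocks independently, scan the per-block totals, and then add each block's offset back to its elements. The sketch below illustrates that strategy on the CPU. The block size, the recursion on block totals, and all names are assumptions of this example.

```python
def scan_block(block):
    """Exclusive scan of one block; also returns the block's total."""
    running = 0
    scanned = []
    for value in block:
        scanned.append(running)
        running += value
    return scanned, running


def exclusive_scan_large(values, block_size=4):
    """Exclusive scan of an arbitrary-length list, processed block by block."""
    if block_size < 2:
        raise ValueError("block_size must be at least 2")
    values = list(values)
    scanned_blocks = []
    totals = []
    for start in range(0, len(values), block_size):
        scanned, total = scan_block(values[start:start + block_size])
        scanned_blocks.append(scanned)
        totals.append(total)
    # Scan the block totals; recurse if they do not fit in one block.
    if len(totals) > block_size:
        offsets = exclusive_scan_large(totals, block_size)
    else:
        offsets, _ = scan_block(totals)
    # Add each block's offset to every element of that block.
    return [
        offset + value
        for scanned, offset in zip(scanned_blocks, offsets)
        for value in scanned
    ]


if __name__ == "__main__":
    print(exclusive_scan_large(list(range(1, 11)), block_size=4))
    # prints [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
```

The per-block scans are independent of one another, so on a GPU each one could be assigned to its own thread block.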
|
Fluids (OpenGL Version)
An example of fluid simulation using CUDA and CUFFT, with OpenGL rendering. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Download - Windows
Download - Linux
|
|
|
Fluids (Direct3D Version)
An example of fluid simulation using CUDA and CUFFT, with Direct3D 9 rendering. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Download - Windows
|
|
|
1D Discrete Haar Wavelet Decomposition
Discrete Haar wavelet decomposition for 1D signals whose length is a power of 2. |
|
![Minimum Required GPU](images/GEF8_2D_wte.gif)
or later
Download - Windows
Download - Linux
|
|
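A full 1D Haar decomposition, as described in the last sample above, repeatedly replaces neighboring pairs with a scaled sum (approximation) and a scaled difference (detail). The CPU sketch below uses the orthonormal 1/sqrt(2) scaling and returns the final approximation followed by the details from coarsest to finest. Both the scaling and the output layout are assumptions of this example, and the function name is hypothetical.

```python
import math


def haar_decompose(signal):
    """Return [approximation, coarsest details, ..., finest details]."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    scale = 1.0 / math.sqrt(2.0)
    approx = [float(v) for v in signal]
    details = []
    while len(approx) > 1:
        pairs = range(0, len(approx), 2)
        level_details = [(approx[i] - approx[i + 1]) * scale for i in pairs]
        approx = [(approx[i] + approx[i + 1]) * scale for i in pairs]
        # Coarser levels go in front of the finer ones collected so far.
        details = level_details + details
    return approx + details


if __name__ == "__main__":
    print(haar_decompose([4, 2, 5, 5]))
```

With the orthonormal scaling, the sum of squares of the output equals that of the input, which gives a quick sanity check for any implementation.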