--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
NVIDIA CUDA
Windows XP and Vista Release Notes
Version 2.1
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
New Features
--------------------------------------------------------------------------------

  Hardware Support
  o TESLA devices are now supported on Windows Vista
  o See http://www.nvidia.com/object/cuda_learn_products.html

  API Features
  o PTX JIT API
    - cuModuleLoadDataEx
  o New Direct3D 9 interoperability API
    - cuD3D9ResourceGetMappedArray
  o Direct3D 10 interoperability API
    - cuD3D10GetDevice
    - cuD3D10CtxCreate
    - cuD3D10RegisterResource
    - cuD3D10UnregisterResource
    - cuD3D10MapResources
    - cuD3D10UnmapResources
    - cuD3D10ResourceSetMapFlags
    - cuD3D10ResourceGetMappedArray
    - cuD3D10ResourceGetMappedPointer
    - cuD3D10ResourceGetMappedSize
    - cuD3D10ResourceGetMappedPitch
    - cuD3D10ResourceGetSurfaceDimensions

  Compiler support
  o Additional compiler support
    - Microsoft Visual C++ 9
  o Eliminated compiler support
    - Microsoft Visual C++ 7

--------------------------------------------------------------------------------
Major Bug Fixes
--------------------------------------------------------------------------------

  o OpenGL interoperability will now only copy shared buffers through host
    memory when CUDA and OpenGL are running on different GPUs.

--------------------------------------------------------------------------------
Known Issues
--------------------------------------------------------------------------------

Vista Specific Issues:
  o In order to run CUDA on a non-TESLA GPU, either the Windows desktop must be
    extended onto the GPU, or the GPU must be selected as the PhysX GPU.
  o Individual kernels are limited to a 2-second runtime by Windows Vista.
    Kernels that run for longer than 2 seconds will trigger the Timeout
    Detection and Recovery (TDR) mechanism. For more information, see
    http://www.microsoft.com/whdc/device/display/wddm_timeout.mspx.
  o The CUDA Profiler does not support performance counter events on Windows
    Vista. All profiler configuration regarding performance counter events is
    ignored.
  o On Windows Vista, asynchronous memory copies do not support GPU overlap.
    CU_DEVICE_ATTRIBUTE_GPU_OVERLAP will be 0 for all devices.
  o The maximum size of a single allocation created by cudaMalloc or cuMemAlloc
    is limited to ( System Memory Size in MB - 512 MB ) / 2.

XP Specific Issues:
  o Individual GPU program launches are limited to a run time of less than
    5 seconds on a GPU with a display attached. Exceeding this time limit
    usually causes a launch failure reported through the CUDA driver or the
    CUDA runtime. GPUs without a display attached are not subject to the
    5-second runtime restriction. For this reason it is recommended that CUDA
    be run on a GPU that is NOT attached to a display and does not have the
    Windows desktop extended onto it. In this case, the system must contain at
    least one NVIDIA GPU that serves as the primary graphics adapter.

Issues Common to XP and Vista:
  o GPU enumeration order on multi-GPU systems is non-deterministic and may
    change with this or future releases. Users should make sure to enumerate
    all CUDA-capable GPUs in the system and select the most appropriate one(s)
    to use.
  o Applications that try to use too much memory may cause a CUDA memory copy
    or kernel to fail with the error CUDA_ERROR_OUT_OF_MEMORY. If this happens,
    the CUDA context is placed into an error state and must be destroyed and
    recreated if the application wants to continue using CUDA.
  o Malloc may fail due to running out of virtual memory space. The address
    space limitation is fixed by a Microsoft-issued hotfix.
    Please install the patch located at http://support.microsoft.com/kb/940105
    if this is an issue. Windows Vista SP1 includes this hotfix.
  o When two GPUs are run in SLI mode, only one of the GPUs will be available
    to the user for executing CUDA programs.
  o When using Microsoft Visual Studio 8.0, it is required that Service Pack 1
    be installed. Certain Windows C++ header files will cause a crash in cudafe
    without it.
  o The default compilation mode for host code is now C++. To restore the old
    behavior, use the option --host-compilation=c
  o For maximum performance when using multiple byte sizes to access the same
    data, coalesce adjacent loads and stores when possible rather than using a
    union or individual byte accesses. Accessing the data via a union may
    result in the compiler reserving extra memory for the object, and accessing
    the data as individual bytes may result in non-coalesced accesses. This
    will be improved in a future compiler release.

--------------------------------------------------------------------------------
Open64 Sources
--------------------------------------------------------------------------------

The Open64 source files are controlled under terms of the GPL license.
Current and previously released versions are located via anonymous ftp at
download.nvidia.com in the CUDAOpen64 directory.

--------------------------------------------------------------------------------
Revision History
--------------------------------------------------------------------------------

  11/2008 - Version 2.1 Beta
  06/2008 - Version 2.0
  11/2007 - Version 1.1
  06/2007 - Version 1.0
  06/2007 - Version 0.9
  02/2007 - Version 0.8 - Initial public Beta

--------------------------------------------------------------------------------
More Information
--------------------------------------------------------------------------------

For more information and help with CUDA, please visit
http://www.nvidia.com/cuda
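
--------------------------------------------------------------------------------
Example: Vista Single-Allocation Limit
--------------------------------------------------------------------------------

The Vista known issue on allocation size gives the limit for a single
cudaMalloc or cuMemAlloc allocation as ( System Memory Size in MB - 512 MB ) / 2.
The sketch below only works through that arithmetic in plain C. It is not part
of the CUDA API: the function name max_single_alloc_mb is hypothetical, and
both the clamp to 0 for systems with 512 MB or less and the use of truncating
integer division are assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a CUDA API call): estimates the largest single
 * allocation, in MB, allowed on Windows Vista according to the formula
 * ( System Memory Size in MB - 512 MB ) / 2 from the Known Issues list.
 *
 * Assumptions: the result is clamped to 0 when system memory is 512 MB or
 * less, and the division truncates toward zero.
 */
uint64_t max_single_alloc_mb(uint64_t system_memory_mb)
{
    const uint64_t reserved_mb = 512;

    if (system_memory_mb <= reserved_mb) {
        return 0; /* assumed: nothing left to allocate */
    }
    return (system_memory_mb - reserved_mb) / 2;
}
```

For instance, under these assumptions a system with 4096 MB of memory yields
(4096 - 512) / 2 = 1792 MB, and a system with 2048 MB yields 768 MB. An
application could compare a requested size against this estimate before
calling cudaMalloc, while still checking the returned error code.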