cudaError_t cudaGetDeviceProperties ( struct cudaDeviceProp * prop,
int device
)

Returns in *prop the properties of the device identified by device. The cudaDeviceProp structure contains the following fields:
  • name[256] is an ASCII string identifying the device;
  • totalGlobalMem is the total amount of global memory available on the device in bytes;
  • sharedMemPerBlock is the maximum amount of shared memory available to a thread block in bytes; this amount is shared by all thread blocks simultaneously resident on a multiprocessor;
  • regsPerBlock is the maximum number of 32-bit registers available to a thread block; this number is shared by all thread blocks simultaneously resident on a multiprocessor;
  • warpSize is the warp size in threads;
  • memPitch is the maximum pitch in bytes allowed by the memory copy functions that involve memory regions allocated through cudaMallocPitch();
  • maxThreadsPerBlock is the maximum number of threads per block;
  • maxThreadsDim[3] contains the maximum size of each dimension of a block;
  • maxGridSize[3] contains the maximum size of each dimension of a grid;
  • clockRate is the clock frequency in kilohertz;
  • totalConstMem is the total amount of constant memory available on the device in bytes;
  • major, minor are the major and minor revision numbers defining the device's compute capability;
  • textureAlignment is the alignment requirement; texture base addresses that are aligned to textureAlignment bytes do not need an offset applied to texture fetches;
  • deviceOverlap is 1 if the device can concurrently copy memory between host and device while executing a kernel, or 0 if not. Deprecated; use asyncEngineCount instead.
  • multiProcessorCount is the number of multiprocessors on the device;
  • kernelExecTimeoutEnabled is 1 if there is a run time limit for kernels executed on the device, or 0 if not.
  • integrated is 1 if the device is an integrated (motherboard) GPU and 0 if it is a discrete (card) component.
  • canMapHostMemory is 1 if the device can map host memory into the CUDA address space for use with cudaHostAlloc()/cudaHostGetDevicePointer(), or 0 if not;
  • computeMode is the compute mode that the device is currently in. Available modes are as follows:
    • cudaComputeModeDefault: Default mode - Device is not restricted and multiple threads can use cudaSetDevice() with this device.
    • cudaComputeModeExclusive: Compute-exclusive mode - Only one thread will be able to use cudaSetDevice() with this device.
    • cudaComputeModeProhibited: Compute-prohibited mode - No threads can use cudaSetDevice() with this device.
    • cudaComputeModeExclusiveProcess: Compute-exclusive-process mode - Many threads in one process will be able to use cudaSetDevice() with this device.
    Any errors from calling cudaSetDevice() with an exclusive (and occupied) or prohibited device will only show up after a non-device management runtime function is called. At that time, cudaErrorNoDevice will be returned.
  • maxTexture1D is the maximum 1D texture size.
  • maxTexture2D[2] contains the maximum 2D texture dimensions.
  • maxTexture3D[3] contains the maximum 3D texture dimensions.
  • maxTexture1DLayered[2] contains the maximum 1D layered texture dimensions.
  • maxTexture2DLayered[3] contains the maximum 2D layered texture dimensions.
  • surfaceAlignment specifies the alignment requirements for surfaces.
  • concurrentKernels is 1 if the device supports executing multiple kernels within the same context simultaneously, or 0 if not. It is not guaranteed that multiple kernels will be resident on the device concurrently, so this feature should not be relied upon for correctness;
  • ECCEnabled is 1 if the device has ECC support turned on, or 0 if not.
  • pciBusID is the PCI bus identifier of the device.
  • pciDeviceID is the PCI device (sometimes called slot) identifier of the device.
  • pciDomainID is the PCI domain identifier of the device.
  • tccDriver is 1 if the device is using a TCC driver or 0 if not.
  • asyncEngineCount is 1 when the device can concurrently copy memory between host and device while executing a kernel. It is 2 when the device can concurrently copy memory between host and device in both directions and execute a kernel at the same time. It is 0 if neither of these is supported.
  • unifiedAddressing is 1 if the device shares a unified address space with the host and 0 otherwise.
  • memoryClockRate is the peak memory clock frequency in kilohertz.
  • memoryBusWidth is the memory bus width in bits.
  • l2CacheSize is the L2 cache size in bytes.
  • maxThreadsPerMultiProcessor is the maximum number of resident threads per multiprocessor.
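
As a rough illustration of how the units listed above combine, the following C sketch derives a few quantities from field values. It assumes the values have already been read from a cudaDeviceProp filled in by cudaGetDeviceProperties(). The helper names (clock_rate_mhz, warps_per_multiprocessor, peak_memory_bandwidth, copy_overlap_description) are hypothetical and not part of the CUDA runtime, and the transfers_per_clock factor (2 for double-data-rate memory) is an assumption about the memory type, not something the structure reports.

```c
/* Hypothetical helpers, not part of the CUDA runtime. Each takes plain
   field values so the sketch compiles without the CUDA headers. */

/* clockRate is reported in kilohertz; convert it to megahertz. */
double clock_rate_mhz(int clock_rate_khz)
{
    return (double)clock_rate_khz / 1000.0;
}

/* Maximum resident warps per multiprocessor:
   maxThreadsPerMultiProcessor divided by warpSize.
   Returns 0 if warp_size is not positive. */
int warps_per_multiprocessor(int max_threads_per_mp, int warp_size)
{
    return warp_size > 0 ? max_threads_per_mp / warp_size : 0;
}

/* Theoretical peak memory bandwidth in bytes per second.
   memory_clock_khz:    memoryClockRate (kilohertz)
   bus_width_bits:      memoryBusWidth (bits)
   transfers_per_clock: ASSUMPTION supplied by the caller, e.g. 2 for
                        double-data-rate memory */
double peak_memory_bandwidth(int memory_clock_khz, int bus_width_bits,
                             int transfers_per_clock)
{
    double clock_hz = (double)memory_clock_khz * 1000.0;
    double bytes_per_transfer = (double)bus_width_bits / 8.0;
    return clock_hz * bytes_per_transfer * (double)transfers_per_clock;
}

/* Describes asyncEngineCount using only the three values documented above. */
const char *copy_overlap_description(int async_engine_count)
{
    switch (async_engine_count) {
    case 0:  return "no overlap of copies and kernel execution";
    case 1:  return "copy in one direction overlaps kernel execution";
    case 2:  return "copies in both directions overlap kernel execution";
    default: return "value not described in this reference";
    }
}
```

For example, with memoryClockRate = 1000000 and memoryBusWidth = 256, peak_memory_bandwidth(1000000, 256, 2) evaluates to 6.4e10 bytes per second (64 GB/s), and warps_per_multiprocessor(1536, 32) evaluates to 48.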

Parameters:
prop - Properties for the specified device
device - Device number to get properties for
Returns:
cudaSuccess, cudaErrorInvalidDevice
See also:
cudaGetDeviceCount, cudaGetDevice, cudaSetDevice, cudaChooseDevice
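
The functions listed under See also are commonly combined with cudaGetDeviceProperties to select a device. The selection logic alone can be sketched in plain C as follows. struct device_summary and pick_device are hypothetical names introduced here; the struct stands in for a handful of cudaDeviceProp fields so that the sketch compiles without the CUDA headers. The ranking rule (highest compute capability, then most multiprocessors, skipping devices in cudaComputeModeProhibited) is one possible policy and is not a description of what cudaChooseDevice does.

```c
/* Hypothetical stand-in for a few cudaDeviceProp fields. In real code each
   entry would be filled from a cudaGetDeviceProperties() call, one per
   device index reported by cudaGetDeviceCount(). */
struct device_summary {
    int major;               /* compute capability, major revision */
    int minor;               /* compute capability, minor revision */
    int multiProcessorCount; /* number of multiprocessors */
    int prohibited;          /* 1 if computeMode is cudaComputeModeProhibited */
};

/* Returns the index of the preferred device, or -1 if no device is usable.
   Policy (an assumption of this sketch): prefer the highest compute
   capability, break ties by multiprocessor count, and keep the earlier
   device when both are equal. */
int pick_device(const struct device_summary *devs, int count)
{
    int best = -1;
    for (int i = 0; i < count; ++i) {
        if (devs[i].prohibited)
            continue; /* no thread may use cudaSetDevice() with this device */
        if (best < 0) {
            best = i;
            continue;
        }
        int cc_i = devs[i].major * 10 + devs[i].minor;
        int cc_best = devs[best].major * 10 + devs[best].minor;
        if (cc_i > cc_best ||
            (cc_i == cc_best &&
             devs[i].multiProcessorCount > devs[best].multiProcessorCount))
            best = i;
    }
    return best;
}
```

The returned index would then be passed to cudaSetDevice(); a result of -1 means every device was in compute-prohibited mode or the list was empty.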


Generated by Doxygen for NVIDIA CUDA Library