CUDA Math API (PDF) - CUDA Toolkit v5.5 (older) - Last updated May 11, 2013 - Send Feedback

1.4. Single Precision Intrinsics

This section describes single precision intrinsic functions that are only supported in device code.

Functions

__device__ __cudart_builtin__ float __cosf ( float x )
Calculate the fast approximate cosine of the input argument.
__device__ __cudart_builtin__ float __exp10f ( float x )
Calculate the fast approximate base 10 exponential of the input argument.
__device__ __cudart_builtin__ float __expf ( float x )
Calculate the fast approximate base e exponential of the input argument.
__device__ float __fadd_rd ( float x, float y )
Add two floating point values in round-down mode.
__device__ float __fadd_rn ( float x, float y )
Add two floating point values in round-to-nearest-even mode.
__device__ float __fadd_ru ( float x, float y )
Add two floating point values in round-up mode.
__device__ float __fadd_rz ( float x, float y )
Add two floating point values in round-towards-zero mode.
__device__ float __fdiv_rd ( float x, float y )
Divide two floating point values in round-down mode.
__device__ float __fdiv_rn ( float x, float y )
Divide two floating point values in round-to-nearest-even mode.
__device__ float __fdiv_ru ( float x, float y )
Divide two floating point values in round-up mode.
__device__ float __fdiv_rz ( float x, float y )
Divide two floating point values in round-towards-zero mode.
__device__ float __fdividef ( float x, float y )
Calculate the fast approximate division of the input arguments.
__device__ float __fmaf_rd ( float x, float y, float z )
Compute x × y + z as a single operation, in round-down mode.
__device__ float __fmaf_rn ( float x, float y, float z )
Compute x × y + z as a single operation, in round-to-nearest-even mode.
__device__ float __fmaf_ru ( float x, float y, float z )
Compute x × y + z as a single operation, in round-up mode.
__device__ float __fmaf_rz ( float x, float y, float z )
Compute x × y + z as a single operation, in round-towards-zero mode.
__device__ float __fmul_rd ( float x, float y )
Multiply two floating point values in round-down mode.
__device__ float __fmul_rn ( float x, float y )
Multiply two floating point values in round-to-nearest-even mode.
__device__ float __fmul_ru ( float x, float y )
Multiply two floating point values in round-up mode.
__device__ float __fmul_rz ( float x, float y )
Multiply two floating point values in round-towards-zero mode.
__device__ float __frcp_rd ( float x )
Compute $1/x$ in round-down mode.
__device__ float __frcp_rn ( float x )
Compute $1/x$ in round-to-nearest-even mode.
__device__ float __frcp_ru ( float x )
Compute $1/x$ in round-up mode.
__device__ float __frcp_rz ( float x )
Compute $1/x$ in round-towards-zero mode.
__device__ float __frsqrt_rn ( float x )
Compute $1/\sqrt{x}$ in round-to-nearest-even mode.
__device__ float __fsqrt_rd ( float x )
Compute $\sqrt{x}$ in round-down mode.
__device__ float __fsqrt_rn ( float x )
Compute $\sqrt{x}$ in round-to-nearest-even mode.
__device__ float __fsqrt_ru ( float x )
Compute $\sqrt{x}$ in round-up mode.
__device__ float __fsqrt_rz ( float x )
Compute $\sqrt{x}$ in round-towards-zero mode.
__device__ float __fsub_rd ( float x, float y )
Subtract two floating point values in round-down mode.
__device__ float __fsub_rn ( float x, float y )
Subtract two floating point values in round-to-nearest-even mode.
__device__ float __fsub_ru ( float x, float y )
Subtract two floating point values in round-up mode.
__device__ float __fsub_rz ( float x, float y )
Subtract two floating point values in round-towards-zero mode.
__device__ __cudart_builtin__ float __log10f ( float x )
Calculate the fast approximate base 10 logarithm of the input argument.
__device__ __cudart_builtin__ float __log2f ( float x )
Calculate the fast approximate base 2 logarithm of the input argument.
__device__ __cudart_builtin__ float __logf ( float x )
Calculate the fast approximate base e logarithm of the input argument.
__device__ __cudart_builtin__ float __powf ( float x, float y )
Calculate the fast approximate of $x^{y}$.
__device__ float __saturatef ( float x )
Clamp the input argument to [+0.0, 1.0].
__device__ __cudart_builtin__ void __sincosf ( float x, float* sptr, float* cptr )
Calculate the fast approximate of sine and cosine of the first input argument.
__device__ __cudart_builtin__ float __sinf ( float x )
Calculate the fast approximate sine of the input argument.
__device__ __cudart_builtin__ float __tanf ( float x )
Calculate the fast approximate tangent of the input argument.

Functions

__device__ __cudart_builtin__ float __cosf ( float x )

Calculate the fast approximate cosine of the input argument x, measured in radians.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-4.

  • Input and output in the denormal range are flushed to sign-preserving 0.0.

Returns

Returns the approximate cosine of x.

__device__ __cudart_builtin__ float __exp10f ( float x )

Calculate the fast approximate base 10 exponential of the input argument x, $10^{x}$.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-4.

  • Most input and output values around the denormal range are flushed to sign-preserving 0.0.

Returns

Returns an approximation to $10^{x}$.

__device__ __cudart_builtin__ float __expf ( float x )

Calculate the fast approximate base e exponential of the input argument x, $e^{x}$.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-4.

  • Most input and output values around the denormal range are flushed to sign-preserving 0.0.

Returns

Returns an approximation to $e^{x}$.

__device__ float __fadd_rd ( float x, float y )

Compute the sum of x and y in round-down (to negative infinity) mode.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

  • This operation will never be merged into a single multiply-add instruction.

Returns

Returns x + y.

__device__ float __fadd_rn ( float x, float y )

Compute the sum of x and y in round-to-nearest-even mode.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

  • This operation will never be merged into a single multiply-add instruction.

Returns

Returns x + y.

__device__ float __fadd_ru ( float x, float y )

Compute the sum of x and y in round-up (to positive infinity) mode.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

  • This operation will never be merged into a single multiply-add instruction.

Returns

Returns x + y.

__device__ float __fadd_rz ( float x, float y )

Compute the sum of x and y in round-towards-zero mode.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

  • This operation will never be merged into a single multiply-add instruction.

Returns

Returns x + y.
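
The four __fadd_* variants differ only in the direction of the final rounding, so the round-down and round-up results bracket the exact sum. The following is a minimal sketch of that property, assuming a CUDA-capable device and the CUDA runtime API; the names add_all_modes_kernel and add_all_modes are hypothetical helpers for illustration and are not part of the CUDA Math API.

```cuda
#include <cuda_runtime.h>

// Evaluate x + y under all four rounding modes on the device.
__global__ void add_all_modes_kernel(float x, float y, float *out)
{
    out[0] = __fadd_rd(x, y);  // toward negative infinity: lower bound
    out[1] = __fadd_ru(x, y);  // toward positive infinity: upper bound
    out[2] = __fadd_rn(x, y);  // round-to-nearest-even
    out[3] = __fadd_rz(x, y);  // toward zero
}

// Hypothetical host helper: launches a single thread and copies the four
// sums into host_out. Returns false if an allocation or copy fails.
bool add_all_modes(float x, float y, float host_out[4])
{
    float *dev_out = 0;
    if (cudaMalloc(&dev_out, 4 * sizeof(float)) != cudaSuccess) return false;
    add_all_modes_kernel<<<1, 1>>>(x, y, dev_out);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(host_out, dev_out, 4 * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev_out);
    return err == cudaSuccess;
}
```

With x = 1.0f and y = 2^-30 the exact sum is not representable in single precision. Round-down, round-to-nearest-even and round-towards-zero should all give 1.0f, while round-up should give the next representable value above 1.0f.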

__device__ float __fdiv_rd ( float x, float y )

Divide x by y in round-down (to negative infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns x / y.

__device__ float __fdiv_rn ( float x, float y )

Divide x by y in round-to-nearest-even mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns x / y.

__device__ float __fdiv_ru ( float x, float y )

Divide x by y in round-up (to positive infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns x / y.

__device__ float __fdiv_rz ( float x, float y )

Divide x by y in round-towards-zero mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns x / y.

__device__ float __fdividef ( float x, float y )

Calculate the fast approximate division of x by y.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-4.

Returns

Returns x / y.

  • __fdividef($\infty$, y) returns NaN for $2^{126} < y < 2^{128}$.
  • __fdividef(x, y) returns 0 for $2^{126} < y < 2^{128}$ and $x \neq \infty$.
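
The special cases of __fdividef for very large divisors are easiest to see next to the correctly rounded __fdiv_rn. The sketch below assumes a CUDA-capable device and the CUDA runtime API; divide_kernel and divide_both_ways are hypothetical names used only for illustration.

```cuda
#include <cuda_runtime.h>

// Compute the same quotient with the fast and the correctly rounded divide.
__global__ void divide_kernel(float x, float y, float *out)
{
    out[0] = __fdividef(x, y);  // fast approximate division
    out[1] = __fdiv_rn(x, y);   // division rounded to nearest even
}

// Hypothetical host helper: launches a single thread and copies both
// quotients into host_out. Returns false if an allocation or copy fails.
bool divide_both_ways(float x, float y, float host_out[2])
{
    float *dev_out = 0;
    if (cudaMalloc(&dev_out, 2 * sizeof(float)) != cudaSuccess) return false;
    divide_kernel<<<1, 1>>>(x, y, dev_out);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(host_out, dev_out, 2 * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev_out);
    return err == cudaSuccess;
}
```

For x = y = 2^127, which lies in the range where y is between 2^126 and 2^128, __fdividef is documented to return 0 whereas __fdiv_rn returns 1. With x set to infinity and the same y, __fdividef is documented to return NaN.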

__device__ float __fmaf_rd ( float x, float y, float z )

Compute the value of x × y + z as a single ternary operation, rounding the result once in round-down (to negative infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns the rounded value of x × y + z as a single operation.

  • fmaf($\pm\infty$, $\pm 0$, z) returns NaN.
  • fmaf($\pm 0$, $\pm\infty$, z) returns NaN.
  • fmaf(x, y, $-\infty$) returns NaN if x × y is an exact $+\infty$.
  • fmaf(x, y, $+\infty$) returns NaN if x × y is an exact $-\infty$.

__device__ float __fmaf_rn ( float x, float y, float z )

Compute the value of x × y + z as a single ternary operation, rounding the result once in round-to-nearest-even mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns the rounded value of x × y + z as a single operation.

  • fmaf($\pm\infty$, $\pm 0$, z) returns NaN.
  • fmaf($\pm 0$, $\pm\infty$, z) returns NaN.
  • fmaf(x, y, $-\infty$) returns NaN if x × y is an exact $+\infty$.
  • fmaf(x, y, $+\infty$) returns NaN if x × y is an exact $-\infty$.

__device__ float __fmaf_ru ( float x, float y, float z )

Compute the value of x × y + z as a single ternary operation, rounding the result once in round-up (to positive infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns the rounded value of x × y + z as a single operation.

  • fmaf($\pm\infty$, $\pm 0$, z) returns NaN.
  • fmaf($\pm 0$, $\pm\infty$, z) returns NaN.
  • fmaf(x, y, $-\infty$) returns NaN if x × y is an exact $+\infty$.
  • fmaf(x, y, $+\infty$) returns NaN if x × y is an exact $-\infty$.

__device__ float __fmaf_rz ( float x, float y, float z )

Compute the value of x × y + z as a single ternary operation, rounding the result once in round-towards-zero mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns the rounded value of x × y + z as a single operation.

  • fmaf($\pm\infty$, $\pm 0$, z) returns NaN.
  • fmaf($\pm 0$, $\pm\infty$, z) returns NaN.
  • fmaf(x, y, $-\infty$) returns NaN if x × y is an exact $+\infty$.
  • fmaf(x, y, $+\infty$) returns NaN if x × y is an exact $-\infty$.
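
Because the __fmaf_* functions round only once, their result can differ from a multiply followed by an add, each of which rounds separately. The following sketch contrasts __fmaf_rn with __fadd_rn applied to __fmul_rn, assuming a CUDA-capable device and the CUDA runtime API; fma_vs_separate_kernel and fma_vs_separate are hypothetical names used only for illustration.

```cuda
#include <cuda_runtime.h>

// Evaluate x * y + z once as a fused operation and once as two operations.
__global__ void fma_vs_separate_kernel(float x, float y, float z, float *out)
{
    out[0] = __fmaf_rn(x, y, z);             // single rounding of x * y + z
    out[1] = __fadd_rn(__fmul_rn(x, y), z);  // product rounded, then sum rounded
}

// Hypothetical host helper: launches a single thread and copies both
// results into host_out. Returns false if an allocation or copy fails.
bool fma_vs_separate(float x, float y, float z, float host_out[2])
{
    float *dev_out = 0;
    if (cudaMalloc(&dev_out, 2 * sizeof(float)) != cudaSuccess) return false;
    fma_vs_separate_kernel<<<1, 1>>>(x, y, z, dev_out);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(host_out, dev_out, 2 * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev_out);
    return err == cudaSuccess;
}
```

For x = y = 1 + 2^-12 and z = -(1 + 2^-11), the exact product is 1 + 2^-11 + 2^-24. The fused form should return 2^-24, whereas the separate multiply first rounds the product to 1 + 2^-11, so the subsequent sum becomes 0.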

__device__ float __fmul_rd ( float x, float y )

Compute the product of x and y in round-down (to negative infinity) mode.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

  • This operation will never be merged into a single multiply-add instruction.

Returns

Returns x * y.

__device__ float __fmul_rn ( float x, float y )

Compute the product of x and y in round-to-nearest-even mode.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

  • This operation will never be merged into a single multiply-add instruction.

Returns

Returns x * y.

__device__ float __fmul_ru ( float x, float y )

Compute the product of x and y in round-up (to positive infinity) mode.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

  • This operation will never be merged into a single multiply-add instruction.

Returns

Returns x * y.

__device__ float __fmul_rz ( float x, float y )

Compute the product of x and y in round-towards-zero mode.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

  • This operation will never be merged into a single multiply-add instruction.

Returns

Returns x * y.

__device__ float __frcp_rd ( float x )

Compute the reciprocal of x in round-down (to negative infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns $1/x$.

__device__ float __frcp_rn ( float x )

Compute the reciprocal of x in round-to-nearest-even mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns $1/x$.

__device__ float __frcp_ru ( float x )

Compute the reciprocal of x in round-up (to positive infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns $1/x$.

__device__ float __frcp_rz ( float x )

Compute the reciprocal of x in round-towards-zero mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns $1/x$.

__device__ float __frsqrt_rn ( float x )

Compute the reciprocal square root of x in round-to-nearest-even mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns $1/\sqrt{x}$.

__device__ float __fsqrt_rd ( float x )

Compute the square root of x in round-down (to negative infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns $\sqrt{x}$.

__device__ float __fsqrt_rn ( float x )

Compute the square root of x in round-to-nearest-even mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns $\sqrt{x}$.

__device__ float __fsqrt_ru ( float x )

Compute the square root of x in round-up (to positive infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns $\sqrt{x}$.

__device__ float __fsqrt_rz ( float x )

Compute the square root of x in round-towards-zero mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

Returns

Returns $\sqrt{x}$.
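
The round-down and round-up variants of the reciprocal and square root functions can be paired to enclose an exact result that is not representable in single precision. The sketch below illustrates this, assuming a CUDA-capable device and the CUDA runtime API; enclose_kernel and enclose are hypothetical names used only for illustration.

```cuda
#include <cuda_runtime.h>

// Compute lower and upper bounds for 1/x and sqrt(x).
__global__ void enclose_kernel(float x, float *out)
{
    out[0] = __frcp_rd(x);   // lower bound of 1/x
    out[1] = __frcp_ru(x);   // upper bound of 1/x
    out[2] = __fsqrt_rd(x);  // lower bound of sqrt(x)
    out[3] = __fsqrt_ru(x);  // upper bound of sqrt(x)
}

// Hypothetical host helper: launches a single thread and copies the four
// bounds into host_out. Returns false if an allocation or copy fails.
bool enclose(float x, float host_out[4])
{
    float *dev_out = 0;
    if (cudaMalloc(&dev_out, 4 * sizeof(float)) != cudaSuccess) return false;
    enclose_kernel<<<1, 1>>>(x, dev_out);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(host_out, dev_out, 4 * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev_out);
    return err == cudaSuccess;
}
```

For x = 3.0f neither 1/3 nor the square root of 3 is representable, so each pair of bounds should consist of two adjacent single precision values. For an input such as 4.0f, where both results are exact, the lower and upper bounds should coincide.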

__device__ float __fsub_rd ( float x, float y )

Compute the difference of x and y in round-down (to negative infinity) mode.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

  • This operation will never be merged into a single multiply-add instruction.

Returns

Returns x - y.

__device__ float __fsub_rn ( float x, float y )

Compute the difference of x and y in round-to-nearest-even mode.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

  • This operation will never be merged into a single multiply-add instruction.

Returns

Returns x - y.

__device__ float __fsub_ru ( float x, float y )

Compute the difference of x and y in round-up (to positive infinity) mode.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

  • This operation will never be merged into a single multiply-add instruction.

Returns

Returns x - y.

__device__ float __fsub_rz ( float x, float y )

Compute the difference of x and y in round-towards-zero mode.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-1.

  • This operation will never be merged into a single multiply-add instruction.

Returns

Returns x - y.

__device__ __cudart_builtin__ float __log10f ( float x )

Calculate the fast approximate base 10 logarithm of the input argument x.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-4.

  • Most input and output values around the denormal range are flushed to sign-preserving 0.0.

Returns

Returns an approximation to $\log_{10}(x)$.

__device__ __cudart_builtin__ float __log2f ( float x )

Calculate the fast approximate base 2 logarithm of the input argument x.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-4.

  • Input and output in the denormal range are flushed to sign-preserving 0.0.

Returns

Returns an approximation to $\log_{2}(x)$.

__device__ __cudart_builtin__ float __logf ( float x )

Calculate the fast approximate base e logarithm of the input argument x.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-4.

  • Most input and output values around the denormal range are flushed to sign-preserving 0.0.

Returns

Returns an approximation to $\log_{e}(x)$.

__device__ __cudart_builtin__ float __powf ( float x, float y )

Calculate the fast approximate of x, the first input argument, raised to the power of y, the second input argument, $x^{y}$.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-4.

  • Most input and output values around the denormal range are flushed to sign-preserving 0.0.

Returns

Returns an approximation to $x^{y}$.

__device__ float __saturatef ( float x )

Clamp the input argument x to be within the interval [+0.0, 1.0].

Returns

  • __saturatef(x) returns 0 if x < 0.
  • __saturatef(x) returns 1 if x > 1.
  • __saturatef(x) returns x if $0 \le x \le 1$.
  • __saturatef(NaN) returns 0.
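
The four cases listed above can be exercised with one kernel that clamps an array element by element. This is a minimal sketch, assuming a CUDA-capable device, the CUDA runtime API and n greater than zero; saturate_kernel and saturate_on_device are hypothetical names used only for illustration.

```cuda
#include <cuda_runtime.h>

// Clamp each input element to [+0.0, 1.0]; NaN inputs become 0.
__global__ void saturate_kernel(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = __saturatef(in[i]);
}

// Hypothetical host helper: copies n inputs to the device, clamps them and
// copies the results back. Returns false if an allocation or copy fails.
bool saturate_on_device(const float *host_in, float *host_out, int n)
{
    float *dev_in = 0, *dev_out = 0;
    size_t bytes = n * sizeof(float);
    bool ok = cudaMalloc(&dev_in, bytes) == cudaSuccess
           && cudaMalloc(&dev_out, bytes) == cudaSuccess
           && cudaMemcpy(dev_in, host_in, bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int threads = 128;
        int blocks = (n + threads - 1) / threads;  // round up to cover n
        saturate_kernel<<<blocks, threads>>>(dev_in, dev_out, n);
        // cudaMemcpy waits for the kernel to finish before copying.
        ok = cudaMemcpy(host_out, dev_out, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev_in);   // freeing a null pointer is a no-op
    cudaFree(dev_out);
    return ok;
}
```

Passing the inputs -0.5, 0.25, 2.0 and NaN should produce 0, 0.25, 1 and 0 respectively, one value per documented case.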

__device__ __cudart_builtin__ void __sincosf ( float x, float* sptr, float* cptr )

Calculate the fast approximate of sine and cosine of the first input argument x (measured in radians). The sine is written to the second argument, sptr, and the cosine to the third argument, cptr.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-4.

  • Denormal input and output are flushed to sign-preserving 0.0.

Returns

  • none
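
Since __sincosf returns nothing and delivers both results through pointers, a call site needs two local variables to receive them. The sketch below shows that calling pattern next to the standard sinf and cosf for comparison, assuming a CUDA-capable device and the CUDA runtime API; sincos_kernel and sincos_compare are hypothetical names used only for illustration.

```cuda
#include <cuda_runtime.h>

// Compute sine and cosine of x with the fast intrinsic and, for reference,
// with the standard single precision functions.
__global__ void sincos_kernel(float x, float *out)
{
    float s, c;
    __sincosf(x, &s, &c);  // fast approximations written through the pointers
    out[0] = s;
    out[1] = c;
    out[2] = sinf(x);      // reference sine
    out[3] = cosf(x);      // reference cosine
}

// Hypothetical host helper: launches a single thread and copies the four
// values into host_out. Returns false if an allocation or copy fails.
bool sincos_compare(float x, float host_out[4])
{
    float *dev_out = 0;
    if (cudaMalloc(&dev_out, 4 * sizeof(float)) != cudaSuccess) return false;
    sincos_kernel<<<1, 1>>>(x, dev_out);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(host_out, dev_out, 4 * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev_out);
    return err == cudaSuccess;
}
```

For moderate arguments such as x = 0.5f the fast and reference values should agree closely; the accuracy table cited in the note above gives the documented bounds. When the code is compiled with -use_fast_math, sinf and cosf are themselves mapped to the fast intrinsics, so the comparison is only meaningful without that option.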

__device__ __cudart_builtin__ float __sinf ( float x )

Calculate the fast approximate sine of the input argument x, measured in radians.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-4.

  • Input and output in the denormal range are flushed to sign-preserving 0.0.

Returns

Returns the approximate sine of x.

__device__ __cudart_builtin__ float __tanf ( float x )

Calculate the fast approximate tangent of the input argument x, measured in radians.

Note:
  • For accuracy information for this function see the CUDA C Programming Guide, Appendix C, Table C-4.

  • The result is computed as the fast divide of __sinf() by __cosf(). Denormal input and output are flushed to sign-preserving 0.0 at each step of the computation.

Returns

Returns the approximate tangent of x.
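
The note above describes __tanf as the fast divide of __sinf() by __cosf(). The following sketch spells that composition out with __fdividef and evaluates it beside __tanf and the standard tanf, assuming a CUDA-capable device and the CUDA runtime API; tan_kernel and tan_compare are hypothetical names used only for illustration, and the explicit composition is an illustration of the note rather than the library's actual source.

```cuda
#include <cuda_runtime.h>

// Evaluate the tangent three ways for comparison.
__global__ void tan_kernel(float x, float *out)
{
    out[0] = __tanf(x);                          // fast intrinsic
    out[1] = __fdividef(__sinf(x), __cosf(x));   // composition from the note
    out[2] = tanf(x);                            // standard function
}

// Hypothetical host helper: launches a single thread and copies the three
// values into host_out. Returns false if an allocation or copy fails.
bool tan_compare(float x, float host_out[3])
{
    float *dev_out = 0;
    if (cudaMalloc(&dev_out, 3 * sizeof(float)) != cudaSuccess) return false;
    tan_kernel<<<1, 1>>>(x, dev_out);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(host_out, dev_out, 3 * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev_out);
    return err == cudaSuccess;
}
```

For a moderate argument such as x = 0.5f the first two values should be essentially identical, and both should lie close to the standard tanf result.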

