Name
NV_vertex_program1_1
Name Strings
GL_NV_vertex_program1_1
Contact
Mark J. Kilgard, NVIDIA Corporation (mjk 'at' nvidia.com)
Notice
Copyright NVIDIA Corporation, 2001, 2002.
IP Status
NVIDIA Proprietary.
Status
Version 1.0
Version
NVIDIA Date: September 3, 2002
Version: 7
Number
266
Dependencies
Written based on the wording of the OpenGL 1.2.1 specification and
requires OpenGL 1.2.1.
Assumes support for the NV_vertex_program extension.
Overview
This extension adds four new vertex program instructions (DPH,
RCC, SUB, and ABS).
This extension also supports a position-invariant vertex program
option. A vertex program is position-invariant when it generates
the _exact_ same homogenuous position and window space position
for a vertex as conventional OpenGL transformation (ignoring vertex
blending and weighting).
By default, vertex programs are _not_ guaranteed to be
position-invariant because there is no guarantee made that the way
a vertex program might compute its homogenous position is exactly
identical to the way conventional OpenGL transformation computes
its homogenous positions. In a position-invariant vertex program,
the homogeneous position (HPOS) is not output by the program.
Instead, the OpenGL implementation is expected to compute the HPOS
for position-invariant vertex programs in a manner exactly identical
to how the homogenous position and window position are computed
for a vertex by conventional OpenGL transformation. In this way
position-invariant vertex programs guarantee correct multi-pass
rendering semantics in cases where multiple passes are rendered and
the second and subsequent passes use a GL_EQUAL depth test.
Issues
How should options to the vertex program semantics be handled?
RESOLUTION: A VP1.1 vertex program can contain a sequence
of options. This extension provides a single option
("NV_position_invariant"). Specifying an option changes the
way the program's subsequent instruction sequence are parsed,
may add new semantic checks, and modifies the semantics by which
the vertex program is executed.
Should this extension provide SUB and ABS instructions even though
the functionality can be accomplished with ADD and MAX?
RESOLUTION: Yes. SUB and ABS provide no functionality that could
not be accomplished in VP1.0 with ADD and MAX idioms, SUB and ABS
provide more understanable vertex programs.
Should the optionalSign in a VP1.1 accept both "-" and "+"?
RESOLUTION: Yes. The "+" does not negate its operand but is
available for symetry.
Is relative addressing available to position-invariant version 1.1
vertex programs?
RESOLUTION: No. This reflects a hardware restriction.
Should something be said about the relative performance of
position-invariant vertex programs and conventional vertex programs?
RESOLUTION: For architectural reasons, position-invariant vertex
programs may be _slightly_ faster than conventional vertex programs.
This is true in the GeForce3 architecture. If your vertex program
transforms the object-space position to clip-space with four DP4
instructions using the tracked GL_MODELVIEW_PROJECTION_NV matrix,
consider using position-invariant vertex programs. Do not expect a
measurable performance improvement unless vertex program processing
is your bottleneck and your vertex program is relatively short.
Should position-invariant vertex programs have a lower limit on the
maximum instructions?
RESOLUTION: Yes, the driver takes care to match the same
instructions used for position transformation used by conventional
transformation and this requires a few vertex program instructions.
New Procedures and Functions
None.
New Tokens
None.
Additions to Chapter 2 of the OpenGL 1.2.1 Specification (OpenGL Operation)
2.14.1.9 Vertex Program Register Accesses
Replace the first two sentences and update Table X.4:
"There are 21 vertex program instructions. The instructions and their
respective input and output parameters are summarized in Table X.4."
Output
Inputs (vector or
Opcode (scalar or vector) replicated scalar) Operation
------ ------------------ ------------------ --------------------------
ARL s address register address register load
MOV v v move
MUL v,v v multiply
ADD v,v v add
MAD v,v,v v multiply and add
RCP s ssss reciprocal
RSQ s ssss reciprocal square root
DP3 v,v ssss 3-component dot product
DP4 v,v ssss 4-component dot product
DST v,v v distance vector
MIN v,v v minimum
MAX v,v v maximum
SLT v,v v set on less than
SGE v,v v set on greater equal than
EXP s v exponential base 2
LOG s v logarithm base 2
LIT v v light coefficients
DPH v,v ssss homogeneous dot product
RCC s ssss reciprocal clamped
SUB v,v v subtract
ABS v v absolute value
Table X.4: Summary of vertex program instructions. "v" indicates a
vector input or output, "s" indicates a scalar input, and "ssss" indicates
a scalar output replicated across a 4-component vector.
Add four new sections describing the DPH, RCC, SUB, and ABS
instructions.
"2.14.1.10.18 DPH: Homogeneous Dot Product
The DPH instruction assigns the four-component dot product of the
two source vectors where the W component of the first source vector
is assumed to be 1.0 into the destination register.
t.x = source0.c***;
t.y = source0.*c**;
t.z = source0.**c*;
if (negate0) {
t.x = -t.x;
t.y = -t.y;
t.z = -t.z;
}
u.x = source1.c***;
u.y = source1.*c**;
u.z = source1.**c*;
u.w = source1.***c;
if (negate1) {
u.x = -u.x;
u.y = -u.y;
u.z = -u.z;
u.w = -u.w;
}
v.x = t.x * u.x + t.y * u.y + t.z * u.z + u.w;
if (xmask) destination.x = v.x;
if (ymask) destination.y = v.x;
if (zmask) destination.z = v.x;
if (wmask) destination.w = v.x;
2.14.1.10.19 RCC: Reciprocal Clamped
The RCC instruction inverts the value of the source scalar, clamps
the result as described below, and stores the clamped result into
the destination register. The reciprocal of exactly 1.0 must be
exactly 1.0.
Additionally (before clamping) the reciprocal of negative infinity
gives [-0.0, -0.0, -0.0, -0.0]; the reciprocal of negative zero gives
[-Inf, -Inf, -Inf, -Inf]; the reciprocal of positive zero gives
[+Inf, +Inf, +Inf, +Inf]; and the reciprocal of positive infinity
gives [0.0, 0.0, 0.0, 0.0].
t.x = source0.c;
if (negate0) {
t.x = -t.x;
}
if (t.x == 1.0f) {
u.x = 1.0f;
} else {
u.x = 1.0f / t.x;
}
if (Positive(u.x)) {
if (u.x > 1.884467e+019) {
u.x = 1.884467e+019; // the IEEE 32-bit binary value 0x5F800000
} else if (u.x < 5.42101e-020) {
u.x = 5.42101e-020; // the IEEE 32-bit bindary value 0x1F800000
}
} else {
if (u.x < -1.884467e+019) {
u.x = -1.884467e+019; // the IEEE 32-bit binary value 0xDF800000
} else if (u.x > -5.42101e-020) {
u.x = -5.42101e-020; // the IEEE 32-bit binary value 0x9F800000
}
}
if (xmask) destination.x = u.x;
if (ymask) destination.y = u.x;
if (zmask) destination.z = u.x;
if (wmask) destination.w = u.x;
where Positive(x) is true for +0 and other positive values and false
for -0 and other negative values; and
| u.x - IEEE(1.0f/t.x) | < 1.0f/(2^22)
for 1.0f <= t.x <= 2.0f. The intent of this precision requirement is
that this amount of relative precision apply over all values of t.x."
2.14.1.10.20 SUB: Subtract
The SUB instruction subtracts the values of the one source vector
from another source vector and stores the result into the destination
register.
t.x = source0.c***;
t.y = source0.*c**;
t.z = source0.**c*;
t.w = source0.***c;
if (negate0) {
t.x = -t.x;
t.y = -t.y;
t.z = -t.z;
t.w = -t.w;
}
u.x = source1.c***;
u.y = source1.*c**;
u.z = source1.**c*;
u.w = source1.***c;
if (negate1) {
u.x = -u.x;
u.y = -u.y;
u.z = -u.z;
u.w = -u.w;
}
if (xmask) destination.x = t.x - u.x;
if (ymask) destination.y = t.y - u.y;
if (zmask) destination.z = t.z - u.z;
if (wmask) destination.w = t.w - u.w;
2.14.1.10.21 ABS: Absolute Value
The ABS instruction assigns the component-wise absolute value of a
source vector into the destination register.
t.x = source0.c***;
t.y = source0.*c**;
t.z = source0.**c*;
t.w = source0.***c;
if (xmask) destination.x = (t.x >= 0) ? t.x : -t.x;
if (ymask) destination.y = (t.y >= 0) ? t.y : -t.y;
if (zmask) destination.z = (t.z >= 0) ? t.z : -t.z;
if (wmask) destination.w = (t.w >= 0) ? t.w : -t.w;
Insert sections 2.14.A and 2.14.B after section 2.14.4
"2.14.A Version 1.1 Vertex Programs
Version 1.1 vertex programs provide support for the DPH, RCC, SUB,
and ABS instructions (see sections 2.14.1.10.18 through 2.14.1.10.21).
Version 1.1 vertex programs are loaded with the LoadProgramNV command
(see section 2.14.1.7). The target must be VERTEX_PROGRAM_NV to
load a version 1.1 vertex program. The initial "!!VP1.1" token
designates the program should be parsed and treated as a version 1.1
vertex program.
Version 1.1 programs must conform to a more expanded grammar than
the grammar for vertex programs. The version 1.1 vertex program
grammar for syntactically valid sequences is the same as the grammar
defined in section 2.14.1.7 with the following modified rules:
::= "!!VP1.1" "END"
::=