curves, or at least grouped into small blocks that are themselves row-major (i.e.,
“block-linear”), to avoid the cache penalty of vertical iteration. The origin of a
buffer may differ, and often additional padding is required to ensure that rows
have specific memory alignments for wide vector operations and reduced pointer
size.
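As a rough illustration of the padding and block-linear layouts just described, the following C++ sketch computes addresses for both. The function names, the power-of-two alignment, and the square block size B are assumptions made for this example; they are not taken from any particular graphics API.

```cpp
#include <cassert>
#include <cstddef>

// Bytes per row after rounding up to a multiple of `alignment`.
// Assumes `alignment` is a power of two (e.g., 16 for 128-bit vector loads).
std::size_t paddedRowBytes(std::size_t width, std::size_t bytesPerPixel,
                           std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::size_t raw = width * bytesPerPixel;
    return (raw + alignment - 1) & ~(alignment - 1);
}

// Byte offset of pixel (x, y) in a row-major buffer with padded rows.
std::size_t rowMajorOffset(std::size_t x, std::size_t y,
                           std::size_t bytesPerPixel, std::size_t rowBytes) {
    return y * rowBytes + x * bytesPerPixel;
}

// Element index of pixel (x, y) in a block-linear buffer: blocks of
// B x B pixels are stored in row-major order, and the pixels inside
// each block are also row-major. `widthInBlocks` is the padded width / B.
std::size_t blockLinearIndex(std::size_t x, std::size_t y,
                             std::size_t B, std::size_t widthInBlocks) {
    assert(B != 0);
    const std::size_t blockX = x / B, blockY = y / B;
    const std::size_t inX = x % B, inY = y % B;
    return (blockY * widthInBlocks + blockX) * (B * B) + inY * B + inX;
}
```

In the block-linear case, two vertically adjacent pixels inside the same block are only B elements apart instead of a full row apart, which is the property that softens the cache penalty of vertical iteration.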
Another complicating factor for memory transfer is that one would often like
to overlap computation with memory operations to avoid stalling either the host or
device. Asynchronous transfers are typically accomplished by semantically mapping
device memory into the host address space. Regular host memory operations
can then be performed as if both shared a memory space. In this case the programmer
must manually synchronize both host and device programs to ensure that data
is never read by one while being written by the other. Mapped memory is typically
uncached and often has alignment considerations, so the programmer must
furthermore be careful to control access patterns.
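The synchronization requirement can be illustrated without a graphics device. The sketch below is a host-only analogy and does not use a graphics API: a second thread stands in for the device, an ordinary array stands in for mapped memory, and an atomic flag plays the role of the fence that the programmer must manage by hand. All names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Host-only analogy: `shared` stands in for mapped device memory and
// the `device` thread stands in for the graphics device.
int transferAndSum(const std::vector<int>& source) {
    std::vector<int> shared(source.size());
    std::atomic<bool> ready{false};
    int sum = 0;

    std::thread device([&] {
        // The "device" must not read until the host has finished writing.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        for (int v : shared) {
            sum += v;
        }
    });

    // The "host" writes sequentially, then publishes the data.
    for (std::size_t i = 0; i < source.size(); ++i) {
        shared[i] = source[i];
    }
    ready.store(true, std::memory_order_release);

    // join() orders the device's writes to `sum` before the read below.
    device.join();
    return sum;
}
```

The host writes the array front to back, which is the kind of controlled access pattern that uncached memory tends to favor. A real API would supply its own fence or sync objects in place of the atomic flag.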
Note that memory transfers are intended for large data. For small values, such
as scalars, 4 × 4 matrices, and even short arrays, it would be burdensome to explicitly
allocate, copy, and free the values. For a shading program with twenty or so
arguments, that would incur both runtime and software management overhead. So
small values are often passed through a different API associated with shaders.
15.7.2.2 Memory Practice
Listing 15.30 shows part of an implementation of a triangle mesh class. Making
rendering calls to transfer individual triangles from the host to the graphics device
would be inefficient. So, the API forces us to load a large array of the geometry
to the device once when the scene is created, and to encode that geometry as
efficiently as possible.
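One common way to encode geometry compactly is an indexed mesh, in which each distinct vertex is stored once and triangles refer to vertices by index. The text does not say which encoding Listing 15.30 uses, so the following is only a sketch under that assumption; IndexedMesh, buildIndexedMesh, and the Point3 stand-in are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using Point3 = std::array<float, 3>;  // stand-in for the text's Point3

struct IndexedMesh {
    std::vector<Point3> vertex;        // each distinct position stored once
    std::vector<std::uint32_t> index;  // three entries per triangle
};

// Converts a "triangle soup" (three positions per triangle, with shared
// corners repeated) into an indexed mesh by merging identical positions.
IndexedMesh buildIndexedMesh(const std::vector<Point3>& soup) {
    assert(soup.size() % 3 == 0);
    IndexedMesh mesh;
    std::map<Point3, std::uint32_t> seen;  // position -> index in mesh.vertex
    for (const Point3& p : soup) {
        auto it = seen.find(p);
        if (it == seen.end()) {
            const auto next = static_cast<std::uint32_t>(mesh.vertex.size());
            it = seen.emplace(p, next).first;
            mesh.vertex.push_back(p);
        }
        mesh.index.push_back(it->second);
    }
    return mesh;
}
```

Both arrays can then be uploaded once at scene creation. The saving over a triangle soup grows as more triangles share each vertex.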
Few programmers write directly to hardware graphics APIs. Those APIs reflect
the fact that they are designed by committees and negotiated among vendors. They
provide the necessary functionality but do so through awkward interfaces that
obscure the underlying function of the calling code. Usage is error-prone because
the code operates directly on pointers and uses manually managed memory.
For example, in OpenGL, the code to allocate a device array and bind it to a
shader input looks something like Listing 15.29. Most programmers abstract these
direct host calls into a vendor-independent, easier-to-use interface.
Listing 15.29: Host code for transferring an array of vertices to the device
and binding it to a shader input.
// Allocate device memory for all vertices followed by all normals.
// (hostVertex is used here as a contiguous array of Point3.)
GLuint vbo;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, hostVertex.size() * 2 * sizeof(Vector3), NULL, GL_STATIC_DRAW);

// Byte offsets of the two arrays within the buffer:
const GLintptr deviceVertex = 0;
const GLintptr deviceNormal = static_cast<GLintptr>(hostVertex.size() * sizeof(Vector3));

// Copy memory (glBufferSubData takes a byte offset, not a pointer):
glBufferSubData(GL_ARRAY_BUFFER, deviceVertex, hostVertex.size() *
                sizeof(Point3), &hostVertex[0]);

// Bind the array to a shader input:
GLint vertexIndex = glGetAttribLocation(shader, "vertex");
glEnableVertexAttribArray(vertexIndex);
// With a buffer bound, the last argument is a byte offset passed as a pointer.
glVertexAttribPointer(vertexIndex, 3, GL_FLOAT, GL_FALSE, 0, (const GLvoid*)deviceVertex);
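The layout that Listing 15.29 allocates, with all vertices followed by all normals in one buffer, can be modeled on the host without OpenGL. In this sketch a byte vector plays the role of the device buffer and std::memcpy plays the role of glBufferSubData. PackedBuffer, packVerticesThenNormals, and the Vector3 stand-in are hypothetical, and Point3 is assumed to have the same layout as Vector3.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct Vector3 { float x, y, z; };  // stand-in for the text's Vector3
using Point3 = Vector3;             // assumed to share the same layout

struct PackedBuffer {
    std::vector<unsigned char> bytes;  // plays the role of the device buffer
    std::size_t vertexOffset;          // byte offset of the first vertex
    std::size_t normalOffset;          // byte offset of the first normal
};

// Packs all vertices, then all normals, into one contiguous allocation.
PackedBuffer packVerticesThenNormals(const std::vector<Point3>& vertex,
                                     const std::vector<Vector3>& normal) {
    assert(vertex.size() == normal.size());
    PackedBuffer buffer;
    buffer.vertexOffset = 0;
    buffer.normalOffset = vertex.size() * sizeof(Point3);
    buffer.bytes.resize(buffer.normalOffset + normal.size() * sizeof(Vector3));
    if (!vertex.empty()) {
        std::memcpy(buffer.bytes.data() + buffer.vertexOffset,
                    vertex.data(), vertex.size() * sizeof(Point3));
        std::memcpy(buffer.bytes.data() + buffer.normalOffset,
                    normal.data(), normal.size() * sizeof(Vector3));
    }
    return buffer;
}
```

The two offsets correspond to deviceVertex and deviceNormal in the listing: they are byte positions inside a single allocation, not separate pointers into host memory.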
 
 