There are of course more operations for which one might wish to provide an
abstracted interface. These include per-object and per-mesh transformations, tes-
sellation of curved patches into triangles, and per-triangle operations like silhou-
ette detection or surface extrusion. Various APIs offer abstractions of these within
a programming model similar to vertex and pixel shaders.
Chapter 38 discusses how GPUs are designed to execute this pipeline effi-
ciently. Also refer to your API manual for a discussion of the additional stages
(e.g., tessellate, geometry) that may be available.
15.7.2 Interface
The interface to a software rasterization API can be very simple. Because a soft-
ware rasterizer uses the same memory space and execution model as the host pro-
gram, one can pass the scene as a pointer and the callbacks as function pointers or
classes with virtual methods. Rather than individual triangles, it is convenient to
pass whole meshes to a software rasterizer to decrease the per-triangle overhead.
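As a sketch of that calling convention (the `Vertex`, `Mesh`, and `forEachTriangle` names are invented for illustration, not drawn from any particular API), the host can hand the rasterizer a whole mesh and an ordinary function object:

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical vertex and mesh types for illustration.
struct Vertex { float x, y, z; };

struct Mesh {
    std::vector<Vertex> vertices;
    std::vector<int> indices;  // three indices per triangle
};

// Per-triangle callback supplied by the host program. Because a
// software rasterizer shares the host's memory space, the scene is
// passed by reference and the callback as a plain function object
// (function pointers or virtual methods would serve equally well).
using TriangleFn =
    std::function<void(const Vertex&, const Vertex&, const Vertex&)>;

// Accepting a whole mesh rather than one triangle at a time reduces
// the per-triangle call overhead to a single indexed loop iteration.
void forEachTriangle(const Mesh& mesh, const TriangleFn& fn) {
    for (std::size_t i = 0; i + 2 < mesh.indices.size(); i += 3) {
        fn(mesh.vertices[mesh.indices[i]],
           mesh.vertices[mesh.indices[i + 1]],
           mesh.vertices[mesh.indices[i + 2]]);
    }
}
```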
For a hardware rasterization API, the host machine (i.e., CPU) and graphics
device (i.e., GPU) may have separate memory spaces and execution models. In
this case, shared memory and function pointers no longer suffice. Hardware ras-
terization APIs therefore must impose an explicit memory boundary and narrow
entry points for negotiating it. (This is also true of the fallback and reference soft-
ware implementations of those APIs, such as Mesa and DXRefRast.) Such an API
requires the following entry points, which are detailed in subsequent subsections.
1. Allocate device memory.
2. Copy data between host and device memory.
3. Free device memory.
4. Load (and compile) a shading program from source.
5. Configure the output merger and other fixed-function state.
6. Bind a shading program and set its arguments.
7. Launch a draw call, a set of device threads to render a triangle list.
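A toy in-process model of entry points 1-3 might look like the following. The handle type and function names here are hypothetical stand-ins; real APIs (OpenGL buffer objects, CUDA device pointers) differ in detail but expose the same narrow boundary:

```cpp
#include <cstdlib>
#include <cstring>

// An opaque handle standing in for a pointer into device memory;
// the host cannot dereference it directly, only pass it back to
// the API's entry points.
using DeviceHandle = void*;

DeviceHandle deviceAlloc(std::size_t bytes) {            // 1. allocate
    return std::malloc(bytes);
}
void copyHostToDevice(DeviceHandle dst, const void* src,
                      std::size_t n) {                   // 2. copy in
    std::memcpy(dst, src, n);
}
void copyDeviceToHost(void* dst, DeviceHandle src,
                      std::size_t n) {                   // 2. copy out
    std::memcpy(dst, src, n);
}
void deviceFree(DeviceHandle h) {                        // 3. free
    std::free(h);
}
```

Because all traffic funnels through these few calls, the implementation is free to place the data in a physically separate memory without the host noticing anything but the transfer cost.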
15.7.2.1 Memory Principles
The memory management routines are conceptually straightforward. They
correspond to malloc, memcpy, and free, and they are typically applied to large
arrays, such as an array of vertex data. They are complicated by the details neces-
sary to achieve high performance for the case where data must be transferred per
rendered frame, rather than once per scene. This occurs when streaming geome-
try for a scene that is too large for the device memory; for example, in a world
large enough that the viewer can only ever observe a small fraction at a time. It
also occurs when a data stream from another device, such as a camera, is an input
to the rendering algorithm. Furthermore, hybrid software-hardware rendering and
physics algorithms perform some processing on each of the host and device and
must communicate each frame.
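One common remedy for per-frame transfer cost is double-buffered staging: the host fills one buffer for frame N+1 while the device consumes the other for frame N, so transfer overlaps rendering. A minimal sketch, with invented names (real APIs express this as ring buffers or reallocated dynamic buffers):

```cpp
#include <vector>

// Two staging buffers that swap roles once per frame.
struct StagingPair {
    std::vector<float> buffers[2];
    int hostIndex = 0;  // buffer the host is currently filling

    std::vector<float>& hostBuffer()   { return buffers[hostIndex]; }
    std::vector<float>& deviceBuffer() { return buffers[1 - hostIndex]; }

    // Called once per frame, after the host finishes writing and the
    // device finishes reading, to exchange the two roles.
    void flip() { hostIndex = 1 - hostIndex; }
};
```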
One complicating factor for memory transfer is that it is often desirable to
adjust the data layout and precision of arrays during the transfer. The data struc-
ture for 2D buffers such as images and depth buffers on the host often resembles
the “linear,” row-major ordering that we have used in this chapter. On a graphics processor, 2D buffers are often wrapped along Hilbert or Z-shaped (Morton) curves.
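The Morton ordering mentioned here interleaves the bits of the x and y coordinates, so texels that are close in 2D stay close in memory. A small sketch:

```cpp
#include <cstdint>

// Insert a zero bit between each of the low 16 bits of v,
// e.g., binary 11 -> 0101.
static std::uint32_t spreadBits(std::uint32_t v) {
    v &= 0x0000FFFF;
    v = (v | (v << 8)) & 0x00FF00FF;
    v = (v | (v << 4)) & 0x0F0F0F0F;
    v = (v | (v << 2)) & 0x33333333;
    v = (v | (v << 1)) & 0x55555555;
    return v;
}

// Morton (Z-order) index for a 16-bit (x, y) pair: x occupies the
// even bit positions and y the odd ones.
std::uint32_t mortonIndex(std::uint32_t x, std::uint32_t y) {
    return spreadBits(x) | (spreadBits(y) << 1);
}
```

Walking indices 0, 1, 2, 3, ... under this mapping visits (0,0), (1,0), (0,1), (1,1), tracing the "Z" shape that gives the curve its name.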
 
 