GPUs can typically perform many arithmetic logic unit (ALU) operations in the
time it takes to fetch a value from device memory. The gap is somewhat smaller
when reading from group shared memory (GSM), but it still exists. Once again, a
best practice is to write the algorithm so that each thread can either calculate
the desired values independently or share them through the GSM, and then either
profile both versions or decide dynamically which one to use.
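The "profile both versions, then pick one" advice can be sketched on the host side roughly as follows. This is only a minimal illustration under stated assumptions: `Variant`, `averageMs`, and `chooseVariant` are hypothetical names, and each callable is assumed to block until its GPU work has actually finished (in a real Direct3D 11 application, GPU timestamp queries would be the more accurate way to measure).

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical labels for the two versions of the algorithm.
enum class Variant { Recompute, ShareThroughGsm };

// Average wall-clock milliseconds per run of `run`.
// Assumption: `run` does not return until its work has completed;
// otherwise this only measures how long it takes to submit the work.
inline double averageMs(const std::function<void()>& run, int runs) {
    assert(runs > 0);
    using Clock = std::chrono::steady_clock;
    run();  // warm-up pass, not timed
    const auto start = Clock::now();
    for (int i = 0; i < runs; ++i) {
        run();
    }
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count() / runs;
}

// Pick the version with the lower measured time; ties keep the
// simpler version that recomputes values in each thread.
inline Variant chooseVariant(double recomputeMs, double sharedMs) {
    return sharedMs < recomputeMs ? Variant::ShareThroughGsm : Variant::Recompute;
}
```

An application could time both versions once at startup on the actual hardware and cache the result of `chooseVariant`, which is one way of deciding dynamically rather than hard-coding the choice.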
5.5.2 Choose Appropriate Resource Types
Another very important consideration is the selection of the resource type that will be used.
Depending on the data being processed, one resource type may provide a better overall
algorithm design than others. The resource type will dictate how the resource is accessed
by a thread group, and will also dictate the actual thread group shape and sizes.
Memory Access Patterns
In some cases, the best resource type to choose is fairly obvious. For example, image
processing algorithms typically work with a two-dimensional texture, since that is the
format in which images are stored and used. For other algorithms, the choice is less
clear. For example, when implementing a GPGPU algorithm, the data set can usually be
arranged into whichever resource type makes the most sense. One of the most important
considerations in this regard is how the data must be accessed. Earlier, we mentioned
a particle system using append/consume buffers. Since the algorithm doesn't care about
the order in which the particles are processed, it can use a buffer resource that
provides the append/consume functionality. Alternatively, if a data set is not accessed
directly by index but is instead spatially sampled, a texture resource is a much better
choice, because it can be read through a sampler with filtering. There are
also intrinsic functions, such as the gather function, that can return multiple values from
a texture resource in one call but that can't be used on buffer resources. The resource type
should be chosen so that the algorithm can take advantage of all of the available built-in
hardware and software functionality.
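To make the difference between direct access and spatial sampling concrete, here is a small CPU-side sketch. It is not the HLSL gather intrinsic and does not reproduce its component order or exact addressing rules. `Image`, `Footprint`, `gather2x2`, and `sampleBilinear` are hypothetical names, and clamp-to-edge addressing with texel centres at integer + 0.5 is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical single-channel image stored row-major.
struct Image {
    int width;
    int height;
    std::vector<float> texels;

    // Direct (buffer-style) access by integer index, clamped to the edge.
    float at(int x, int y) const {
        x = std::min(std::max(x, 0), width - 1);
        y = std::min(std::max(y, 0), height - 1);
        return texels[static_cast<std::size_t>(y) * static_cast<std::size_t>(width) +
                      static_cast<std::size_t>(x)];
    }
};

// The four texels surrounding a sample position.
struct Footprint {
    float topLeft, topRight, bottomLeft, bottomRight;
};

// Returns the 2x2 neighbourhood around (u, v), given in texel units.
// Conceptually this is what a gather-style fetch hands back in one call.
inline Footprint gather2x2(const Image& img, float u, float v) {
    const int x0 = static_cast<int>(std::floor(u - 0.5f));
    const int y0 = static_cast<int>(std::floor(v - 0.5f));
    return { img.at(x0, y0), img.at(x0 + 1, y0),
             img.at(x0, y0 + 1), img.at(x0 + 1, y0 + 1) };
}

// Spatial sampling: blend the same 2x2 footprint by the fractional position.
inline float sampleBilinear(const Image& img, float u, float v) {
    const float fx = (u - 0.5f) - std::floor(u - 0.5f);
    const float fy = (v - 0.5f) - std::floor(v - 0.5f);
    const Footprint f = gather2x2(img, u, v);
    const float top = f.topLeft + fx * (f.topRight - f.topLeft);
    const float bottom = f.bottomLeft + fx * (f.bottomRight - f.bottomLeft);
    return top + fy * (bottom - top);
}
```

On the GPU, texture hardware performs the filtering step, whereas an algorithm reading from a buffer would have to issue the four loads and the blend itself. That is the kind of built-in functionality the choice of resource type either exposes or gives up.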
Thread Group and Dispatch Size
Once a resource type has been selected, an appropriate threading pattern needs to be
chosen. In practice, the two will probably be decided together, to select
the best method of accessing the needed data. The dimensions of a thread group dictate
the thread addressing scheme that can be used to access a resource, both for reading input
data and for eventually writing output data. In addition, the total size of the resource must
be covered by the dispatch dimensions, which determine how many thread groups are
instantiated. Therefore, the thread group dimensions and the dispatch dimensions should be
chosen together to provide the simplest and most efficient method of accessing a resource.
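The relationship between resource size, thread group size, and dispatch size comes down to a ceiling division plus an index calculation. The following host-side sketch assumes hypothetical helper names (`Extent3`, `groupsToCover`, `dispatchFor`, `dispatchThreadId`, `insideResource`). The index formula mirrors the way a global thread ID is formed from a group ID and a thread's position within its group.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical three-component size or index.
struct Extent3 {
    std::uint32_t x, y, z;
};

// Smallest number of groups such that groups * groupSize >= resourceSize.
// Assumes resourceSize + groupSize does not overflow 32 bits.
inline std::uint32_t groupsToCover(std::uint32_t resourceSize, std::uint32_t groupSize) {
    assert(groupSize > 0);
    return (resourceSize + groupSize - 1) / groupSize;
}

// Dispatch dimensions that cover the whole resource with the given group shape.
inline Extent3 dispatchFor(Extent3 resource, Extent3 group) {
    return { groupsToCover(resource.x, group.x),
             groupsToCover(resource.y, group.y),
             groupsToCover(resource.z, group.z) };
}

// Global thread index: group index scaled by the group shape,
// plus the thread's index inside its group.
inline Extent3 dispatchThreadId(Extent3 groupId, Extent3 group, Extent3 groupThreadId) {
    return { groupId.x * group.x + groupThreadId.x,
             groupId.y * group.y + groupThreadId.y,
             groupId.z * group.z + groupThreadId.z };
}

// Threads in the last group along an axis may fall outside the resource,
// so a shader would typically guard its reads and writes with a check like this.
inline bool insideResource(Extent3 id, Extent3 resource) {
    return id.x < resource.x && id.y < resource.y && id.z < resource.z;
}
```

For a 1920x1080 texture processed with 16x16x1 thread groups, `dispatchFor` yields 120x68x1 groups. Since 68 x 16 = 1088, the last row of groups overhangs the image by 8 rows, which is why the bounds check matters whenever the resource size is not a multiple of the group size.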