textured, their 2D texel coordinates typically cluster into a small region in the
texture image. Data locality will be greater if the 1D memory addresses of the texels
that are accessed also cluster into a small band of locations.
Let $(x, y)$ be the 2D integer coordinates of a texel in a square texture image of
dimension $w$, and let $a_0$ be the base address of the texture data in memory. Then an
obvious mapping would be raster order, specified as

$$a = a_0 + x + w \cdot y. \tag{38.1}$$

Unfortunately, Equation 38.1 results in a single tight cluster of memory addresses
only if every accessed texel has the same value of $y$. If the patch of texels is not
on a single scanline of the texture image, each value of $y$ produces its own smaller
cluster of addresses, and these clusters are separated by intervals of the texture
dimension $w$. Because $w$ can be large (e.g., 1,024 or even 4,096), the overall
clustering is not tight at all.
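
The effect of raster order on a small patch can be checked numerically. The following is a minimal sketch, assuming a base address of 0 and an arbitrarily chosen 4 × 4 patch; the helper names `raster_address` and `address_span` are illustrative and do not come from the text.

```python
def raster_address(x, y, w, a0=0):
    """Raster-order address of texel (x, y), per Equation 38.1."""
    return a0 + x + w * y


def address_span(addresses):
    """Number of consecutive addresses needed to cover every access."""
    return max(addresses) - min(addresses) + 1


w = 1024  # texture dimension (one of the sizes mentioned in the text)

# Hypothetical 4x4 patch of texels with its corner at (100, 200).
patch = [(x, y) for y in range(200, 204) for x in range(100, 104)]
addrs = [raster_address(x, y, w) for x, y in patch]

# 16 texels are spread across 3 * w + 4 addresses, because each
# scanline of the patch starts w addresses after the previous one.
span = address_span(addrs)
print(span)  # 3076
```

Only 16 texels are touched, yet they occupy a band of 3,076 addresses; a patch confined to one scanline would occupy a band of just 4.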
Spatial locality can be greatly improved by replacing the raster-order mapping
of Equation 38.1 with tiled mapping. The texture image is logically divided into
smaller squares that tile the image, meaning that they cover the image with no
gaps and no overlaps. Tiled mapping is hierarchical: first raster order among tiles,
then raster order within the selected tile. Given a tile dimension of $w_t$, the mapping
is specified as

$$a = a_0 + w_t^2 (x \div w_t) + (x \bmod w_t) + w \, w_t (y \div w_t) + w_t (y \bmod w_t), \tag{38.2}$$

where $\div$ indicates truncated integer division (e.g., $7 \div 4 = 1$) and $\bmod$
indicates modulo division yielding the remainder (e.g., $7 \bmod 4 = 3$). If the
accessed texels
fall within a single tile (i.e., among all the address mappings, only the two
remainder terms, $(x \bmod w_t)$ and $w_t (y \bmod w_t)$, differ) then spatial locality
is improved by a factor of $w / w_t$, because the small clusters of memory addresses
are separated by $w_t$ (which multiplies the $y$ remainder in the final expression in
Equation 38.2) rather than $w$ (which multiplies $y$ in Equation 38.1). Decreasing the
tile size increases the magnitude of the improvement: $8 \times 8$-texel tiling of a
$2{,}048 \times 2{,}048$ texture image improves spatial locality
by a factor of 256! But smaller tiles also increase the likelihood that the cluster
of texels will straddle multiple tiles, which drives locality back down, potentially
below its raster-mapped value, if vertically adjacent tiles are straddled and the 2D
cluster of texels is small. Finding the best balance among such tradeoffs is central
to the art of system design. One clever solution increases the depth of the
hierarchy by implementing tiles within tiles, or even tiles within tiles within tiles. Of
course, this approach runs into limits of complexity as the depth of the hierarchy
is increased, introducing yet another tradeoff.
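
The single-level tiled mapping of Equation 38.2, and the tradeoff around straddled tiles, can be explored with a short script. This is a sketch under stated assumptions: the base address is 0, the tile dimension divides the texture dimension evenly, and the patch positions are chosen only for illustration. The function names are hypothetical, not taken from any GPU or driver interface.

```python
def raster_address(x, y, w, a0=0):
    """Raster-order address of texel (x, y), per Equation 38.1."""
    return a0 + x + w * y


def tiled_address(x, y, w, wt, a0=0):
    """Tiled address per Equation 38.2; assumes wt divides w evenly."""
    if w % wt != 0:
        raise ValueError("tile dimension must divide the texture dimension")
    tile_col, in_x = divmod(x, wt)  # x div wt, x mod wt
    tile_row, in_y = divmod(y, wt)  # y div wt, y mod wt
    return a0 + wt * wt * tile_col + in_x + w * wt * tile_row + wt * in_y


def patch_span(address_fn, x0, y0, size):
    """Width of the address band touched by a size x size texel patch."""
    addrs = [address_fn(x, y)
             for y in range(y0, y0 + size)
             for x in range(x0, x0 + size)]
    return max(addrs) - min(addrs) + 1


W, WT = 2048, 8  # the 2,048 x 2,048 image with 8 x 8 tiles from the text


def raster(x, y):
    return raster_address(x, y, W)


def tiled(x, y):
    return tiled_address(x, y, W, WT)


# A 4x4 patch that lies entirely inside one tile.
inside_raster = patch_span(raster, 96, 200, 4)  # 3 * 2048 + 4 = 6148
inside_tiled = patch_span(tiled, 96, 200, 4)    # 3 * 8 + 4 = 28

# The same patch moved down so it straddles two vertically adjacent tiles.
straddle_raster = patch_span(raster, 96, 206, 4)  # still 6148
straddle_tiled = patch_span(tiled, 96, 206, 4)    # 16348, worse than raster

print(inside_raster, inside_tiled, straddle_raster, straddle_tiled)
```

Inside a tile, the row-to-row separation drops from 2,048 to 8 addresses, which is the factor of 256 quoted above. Once the patch crosses into the tile below, the two halves land roughly $w \cdot w_t$ addresses apart, so the band becomes wider than in raster order.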
We've just seen how the implementation of a GPU, through tiled texel mapping,
can improve spatial locality. GPU architectures (that is, their programming
interfaces) can also be designed to allow improved locality of reference.
For example, the OpenGL programming interface couples each texture image
with its texture reconstruction-filtering mode, allowing the GPU driver to select
image-tiling parameters based on details such as linear versus cubic filtering. The
Direct3D interface allows a single texture image to be used with several texture
interpolation modes. This choice gives programmers more flexibility (they can use
a single texture image for multiple purposes that require different interpolations)