Graphics Reference
In-Depth Information
θ
3
4
1
θ
12
6
...
57
θ:
θθθ
1:
2:
3:
4:
5:
6:
7:
8:
9:
9
1
1
θ
1
θ
1
θ
Ray marching
Ray grid with queued rays
Ray links
Figure 2.7. Ray grid construction by ray marching, adding ray links to traversed cells.
but rather the distance to the next cell has to be recalculated in every marching
step. This is easily achieved by maintaining a vector of the three offsets to the
next cell on every axis, always choosing the smallest of these as the next step
offset. Thus, we ensure that no ray-cell intersection is missed.
Storage options. We can queue the rays into ray grid cells in several ways; the
most straightforward solution is to construct linked lists. Every time a new ray is
enqueued, a volumetric list head texture is atomically updated with the link to a
new head element. HLSL's InterlockedExchange() also yields the previous head
link, which can be stored in the new element to establish a singly linked list.
However, we have found that in our case a more compact and coherent lay-
out is desirable: Once we enter the intersection testing stage, primitives will be
intersected with all rays in each overlapped ray grid cell. For optimal memory
access patterns, the links to rays enqueued in the same cell should lie next to each
other in memory. As primitives typically come in connected batches of triangles,
we can furthermore expect that a lot of them occupy similar sets of neighboring
cells. To make good use of caches, the rays of neighboring cells should also lie
next to each other in memory.
Storing the rays enqueued in the ray grid consecutively along a space-filling Z
curve fulfills these requirements (see Figure 2.7). Such a Z layout can be directly
obtained by applying a 3D parallel prefix sum scan to a grid of cell ray counts.
However, for decently sized grids, this approach would have to work through too
many empty cells to be ecient.
Construction using sorting. Recent research has shown that it is feasible to use
sorting for grid construction [Kalojanov and Slusallek 09]. 1 Using the opti-
mized GPU sorting implementation provided by the B40C library [Merrill and
Grimshaw 11], this turns out to be both simple and ecient.
1 Also see [Kalojanov et al. 11] for multi-level grids.
Search WWH ::




Custom Search