[Figure 2.9 (diagram). First pass: triangle batch; VS: Transform; world-space triangles (stream out); GS: Conservative hull; clipped triangle; (conservative) rasterization over the overlapped non-empty ray grid cells; PS: Voxelize; UAV of (cell index, triangle index) pairs. Second pass: (cell index, triangle index) pairs; GS: Spawn PS threads (# pixels = # rays); rasterization; PS: Intersect single ray (# invocations = # rays).]
Figure 2.9. Intersection testing load balancing using the geometry shader.
the right number of vertex shader executions and thus process the right number
of cell-triangle pairs without having to do a CPU read back. In the geometry
shader of the second pass, a rectangle is generated for each pair. This dispatch
rectangle is scaled to make the number of covered viewport pixels equal to the
number of rays queued in the pair's ray grid cell. Consequently, for each ray, a
fragment is generated and a pixel shader thread is spawned. We have to make
sure that the viewport we rasterize the dispatch rectangles into is large enough
to contain pixels for every ray in the fullest ray grid cell.
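The sizing of a dispatch rectangle can be sketched on the CPU side as follows. This is a minimal illustration, not the chapter's shader code: the names `DispatchRect`, `dispatchRectFor`, `rayIndexForPixel`, and `requiredViewportHeight` are hypothetical, and the row-major mapping from pixels to rays is an assumption. When the ray count is not a multiple of the viewport width, the last row contains padding pixels that would have to be discarded.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type; not taken from the original text.
struct DispatchRect {
    std::uint32_t width;   // pixels per row
    std::uint32_t height;  // number of rows (the last one may be partly padding)
};

// Size a rectangle that provides at least one pixel per queued ray.
// Assumption: pixel (x, y) handles ray y * viewportWidth + x (row-major).
DispatchRect dispatchRectFor(std::uint32_t rayCount, std::uint32_t viewportWidth) {
    assert(viewportWidth > 0);
    if (rayCount == 0) return DispatchRect{0, 0};
    // Rounded-up division without risking 32-bit overflow.
    std::uint32_t height = rayCount / viewportWidth + (rayCount % viewportWidth != 0 ? 1u : 0u);
    std::uint32_t width = rayCount < viewportWidth ? rayCount : viewportWidth;
    return DispatchRect{width, height};
}

// Ray index handled by a pixel, or -1 for a padding pixel that should be discarded.
std::int64_t rayIndexForPixel(std::uint32_t x, std::uint32_t y,
                              std::uint32_t rayCount, std::uint32_t viewportWidth) {
    assert(viewportWidth > 0);
    if (x >= viewportWidth) return -1;
    std::int64_t index = static_cast<std::int64_t>(y) * viewportWidth + x;
    return index < static_cast<std::int64_t>(rayCount) ? index : -1;
}

// Minimum viewport height so that the fullest ray grid cell still fits.
std::uint32_t requiredViewportHeight(std::uint32_t maxRaysPerCell, std::uint32_t viewportWidth) {
    return dispatchRectFor(maxRaysPerCell, viewportWidth).height;
}
```

For example, under these assumptions a cell with 10 queued rays in a viewport 4 pixels wide yields a 4 x 3 rectangle, of which the last two pixels are padding.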
In each of the pixel shader threads, we now only have to perform intersection
testing with one single ray against one single triangle. This way, we achieve fully
concurrent intersection testing, independent of the number of rays queued per
ray grid cell.
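What one such thread computes can be illustrated with a single ray-triangle test. The text does not say which intersection algorithm is used, so the sketch below substitutes the well-known Möller-Trumbore test, written in C++ rather than shader code; `Vec3` and `intersectRayTriangle` are illustrative names.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return Vec3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Moeller-Trumbore ray-triangle test (an assumed stand-in for the per-thread work).
// Returns true and writes the ray parameter to tOut if the ray hits the triangle.
bool intersectRayTriangle(Vec3 origin, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2, float& tOut) {
    assert(dot(dir, dir) > 0.0f);  // the direction must be non-zero
    const float eps = 1e-7f;
    Vec3 e1 = sub(v1, v0);
    Vec3 e2 = sub(v2, v0);
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < eps) return false;  // ray is parallel to the triangle plane
    float invDet = 1.0f / det;
    Vec3 s = sub(origin, v0);
    float u = dot(s, p) * invDet;            // first barycentric coordinate
    if (u < 0.0f || u > 1.0f) return false;
    Vec3 q = cross(s, e1);
    float v = dot(dir, q) * invDet;          // second barycentric coordinate
    if (v < 0.0f || u + v > 1.0f) return false;
    float t = dot(e2, q) * invDet;           // distance along the ray
    if (t < 0.0f) return false;              // triangle lies behind the ray origin
    tOut = t;
    return true;
}
```

Because every thread handles exactly one ray and one triangle, the function contains no loop over rays; that is the property the load balancing scheme relies on.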
Unfortunately, we have to provide two auxiliary buffers: one (created with
D3D11_BIND_STREAM_OUTPUT) to store the streamed-out pre-transformed geometry and
another (bound via a counter-enhanced UAV) to store the intermediate
cell-triangle pairs for one batch of triangles. However, these buffers only need to
provide storage for the largest batch of triangles in the scene. The amount of
storage can be strictly limited by enforcing a maximum triangle batch size, e.g.,
by partitioning larger batches using multiple draw calls.
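Partitioning a large batch into bounded draw calls might look like the sketch below. `DrawRange`, `partitionBatch`, and `streamOutBytes` are hypothetical names, and the stream-out size estimate assumes a non-indexed triangle list with three vertices per triangle at a fixed vertex stride. The capacity of the pair buffer is not estimated here, since it depends on how many ray grid cells each triangle overlaps.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one draw call over a sub-range of a batch.
struct DrawRange {
    std::uint32_t firstTriangle;
    std::uint32_t triangleCount;
};

// Split a batch into consecutive ranges of at most maxBatchSize triangles.
std::vector<DrawRange> partitionBatch(std::uint32_t triangleCount, std::uint32_t maxBatchSize) {
    assert(maxBatchSize > 0);
    std::vector<DrawRange> ranges;
    for (std::uint32_t first = 0; first < triangleCount;) {
        std::uint32_t count = std::min(maxBatchSize, triangleCount - first);
        ranges.push_back(DrawRange{first, count});
        first += count;
    }
    return ranges;
}

// Upper bound on the stream-out storage needed for one batch.
// Assumption: three vertices per triangle, each vertexStride bytes in size.
std::size_t streamOutBytes(std::uint32_t maxBatchSize, std::size_t vertexStride) {
    return static_cast<std::size_t>(maxBatchSize) * 3u * vertexStride;
}
```

With a fixed `maxBatchSize`, both auxiliary buffers can be allocated once up front, independent of the size of the largest mesh in the scene.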