Per-Pixel Lists for Single Pass A-Buffer - GPU Pro: Advanced Rendering Techniques

1.5 Memory Management

All four techniques use the main buffer for storing fragments. We discuss in

Section 1.5.1 how to initialize the buffer at each new frame. All implementations

so far have assumed that the buffer is large enough to hold all incoming fragments.

This may not be true depending on the selected viewpoint, and we therefore

discuss how to manage memory and deal with an overflow of the main buffer in

Section 1.5.2.

1.5.1 The Clear Pass

With the Lin-alloc strategy, the beginning of the main buffer that stores the

heads of the lists has to be zeroed. This is implemented by rasterizing a fullscreen

quad. The global counter for cell allocation has to be initially set to gScreenSize.

In addition, when using the paged allocation scheme with the Pre-Lin method,

an additional array containing for each pixel the free cell index in its last page

has to be cleared as well.

With the Open-alloc strategy the entire main buffer has to be cleared: the

correctness of the insertion algorithm relies on reading a zero value to recognize

a free cell. The array A used to store the per-pixel maximal age has to be cleared

as well.
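
The difference between the two clear strategies can be sketched on the CPU as follows. This is a minimal Python illustration, not the shader code of the chapter: the main buffer is modeled as a flat list, and the function names `clear_lin_alloc` and `clear_open_alloc` are hypothetical. Only `gScreenSize` (here `screen_size`) and the array A (here `max_age`) come from the text.

```python
def clear_lin_alloc(main_buffer, screen_size):
    """Lin-alloc: zero only the list heads stored at the start of the buffer.

    Returns the initial value of the global allocation counter, which
    the text sets to gScreenSize (the first cell past the heads).
    """
    for i in range(screen_size):
        main_buffer[i] = 0
    return screen_size


def clear_open_alloc(main_buffer, max_age):
    """Open-alloc: zero the entire buffer, since a zero marks a free cell.

    The per-pixel maximal age array (A in the text) is cleared as well.
    """
    for i in range(len(main_buffer)):
        main_buffer[i] = 0
    for i in range(len(max_age)):
        max_age[i] = 0
```

With Lin-alloc, cells past the heads may keep stale data from the previous frame, because they are unreachable until the counter hands them out again. This is consistent with the Clear pass being cheaper for the linked-list techniques.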

Figure 1.3 shows a breakdown of the timings of each pass. As can be seen, the

Clear pass is visible only for the Open-alloc techniques, but remains a small

percentage of the overall frame time.

1.5.2 Buffer Overflow

None of the techniques we have discussed require us to count the number of

fragments before the Build pass. Therefore, it is possible for the main buffer

to overflow when too many fragments are inserted within a frame. Our current

strategy is to detect overflow during frame rendering, so that rendering can be

interrupted. When the interruption is detected by the host application, the main

buffer size is increased, following a typical size-doubling strategy, and the frame

rendering is restarted from scratch.
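
The host-side restart logic described above can be sketched as a simple loop. This is a hedged Python illustration: `build_frame` is a hypothetical callback standing in for the whole frame rendering, assumed to return `False` when rendering was interrupted by an overflow.

```python
def render_with_retry(build_frame, buffer_size):
    """Render a frame, doubling the main buffer size after each overflow.

    build_frame(buffer_size) is a hypothetical callback that returns True
    if the frame completed and False if it was interrupted by an overflow.
    Returns the buffer size that finally held all fragments.
    """
    while not build_frame(buffer_size):
        buffer_size *= 2  # size-doubling strategy; restart from scratch
    return buffer_size
```

For instance, if a frame needs 1000 cells and the buffer starts at 256, the loop tries 256, 512, and then succeeds at 1024. Doubling keeps the number of restarts logarithmic in the final size.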

When using linked lists we conveniently detect an overflow by testing if the

global allocation counter exceeds the size of the main buffer. In such a case, the

fragment shader discards all subsequent fragments.
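
A minimal CPU-side sketch of this test follows, assuming the counter starts past the list heads as described in Section 1.5.1. The class `LinAllocBuffer` and its members are hypothetical names. On the GPU the counter increment would be an atomic operation, which plain Python does not model.

```python
class LinAllocBuffer:
    """Toy model of the main buffer with a global allocation counter."""

    def __init__(self, screen_size, buffer_size):
        self.cells = [0] * buffer_size
        self.counter = screen_size  # first free cell lies past the heads
        self.overflow = False       # flag the host would read back

    def allocate(self):
        """Return a free cell index, or None if the fragment is discarded."""
        index = self.counter        # on the GPU: atomic increment
        self.counter += 1
        if index >= len(self.cells):
            self.overflow = True    # counter exceeded the buffer size
            return None
        return index
```

Once the counter passes the buffer size it never decreases within the frame, so every later call also returns `None`, which matches discarding all subsequent fragments.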

The use of open addressing requires a slightly different strategy. We similarly

keep track of the number of inserted fragments by incrementing a global counter.

We increment this counter at the end of the insertion loop, which largely hides

the cost of the atomic increment. With open addressing, the cost of the insertion

grows very fast as the load-factor of the main buffer nears one (Figure 1.4). For

this reason, we interrupt the Build pass when the load-factor gets higher than

10 / 16.
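
The interruption criterion can be written with integer arithmetic only, which avoids a division. This is a small Python sketch under the assumption that the global counter holds the number of fragments inserted so far; the function and constant names are hypothetical.

```python
MAX_LOAD_NUM = 10  # numerator of the load-factor threshold from the text
MAX_LOAD_DEN = 16  # denominator of the threshold


def should_interrupt(inserted, buffer_size):
    """True when the load factor inserted / buffer_size exceeds 10/16."""
    return inserted * MAX_LOAD_DEN > MAX_LOAD_NUM * buffer_size
```

For a buffer of 1600 cells, the Build pass continues up to 1000 inserted fragments and is interrupted at the 1001st.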
