Tiled Deferred Blending - GPU Pro: Advanced Rendering Techniques - page 308

                      Single Draw Call              Multiple Draw Calls
Draw calls            1                             1 per tile
VS data               dynamic VBO                   static VBO + uniforms
Texcoord transforms   on CPU                        on GPU
Number of textures    1 (up to 8 for whole grid)    8 per tile
Unused layers         dynamic branching             different programs
Interpolators usage   high                          low to high depending on aggregated layer count

Table 5.1. Single versus multiple draw calls summary.
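To make the first three rows of the summary concrete, here is a minimal sketch of how the vertex data could be prepared in each variant. It is written in Python purely for illustration and does not come from the chapter: the `UvTransform` type, the scale-and-offset form of the transform, and both builder functions are hypothetical names and assumptions. The single-draw path bakes the transformed texture coordinates into one buffer on the CPU, while the multi-draw path keeps a static quad and hands each tile its transforms as uniforms, leaving the multiplication to the vertex shader.

```python
from dataclasses import dataclass

# Hypothetical per-layer texcoord transform (assumption): uv' = uv * scale + offset.
@dataclass(frozen=True)
class UvTransform:
    scale: tuple
    offset: tuple

    def apply(self, uv):
        return (uv[0] * self.scale[0] + self.offset[0],
                uv[1] * self.scale[1] + self.offset[1])

# Corner order of a unit quad drawn as a triangle strip.
UNIT_QUAD = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

def tile_corners(col, row, cols, rows):
    """Corners of one tile in normalized [0, 1] screen space."""
    return [((col + u) / cols, (row + v) / rows) for u, v in UNIT_QUAD]

def build_single_draw_vbo(cols, rows, tile_layers):
    """Single draw call: texcoords are transformed on the CPU and written
    into one buffer that would be re-uploaded when the layers change."""
    vertices = []
    for row in range(rows):
        for col in range(cols):
            layers = tile_layers.get((col, row), [])
            for pos in tile_corners(col, row, cols, rows):
                vertices.append((pos, [t.apply(pos) for t in layers]))
    return vertices

def build_multi_draw_uniforms(cols, rows, tile_layers):
    """Multiple draw calls: the quad stays static; each tile gets one record
    of uniforms and the vertex shader would apply the transforms."""
    return [{"tile": (col, row), "uv_transforms": tile_layers.get((col, row), [])}
            for row in range(rows) for col in range(cols)]

# Usage: a 2x2 grid where only the first tile has a layer.
tile_layers = {(0, 0): [UvTransform(scale=(2.0, 2.0), offset=(0.5, 0.0))]}
vbo = build_single_draw_vbo(2, 2, tile_layers)
draws = build_multi_draw_uniforms(2, 2, tile_layers)
print(len(vbo), "vertices in one draw;", len(draws), "draws of a static quad")
```

In this sketch every vertex in the single-draw buffer carries one texcoord pair per layer, which is why interpolator usage grows with the layer count in that variant.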

5.4.3 Stencil

The stencil test is another common optimization for effects that cover only portions of the screen [Weber and Quayle 11].

5.5 Results

The savings achieved with this technique depend greatly on the scene complexity, the tile layout and, first and foremost, the underlying architecture. We can, however, estimate a best-case scenario by comparing the cost of rendering several fullscreen quads using regular blending with the cost of rendering the same number of fullscreen quads using tiled deferred blending. The results are presented in Figure 5.6 and in Table 5.2.

We can observe a linear relationship between the number of fullscreen layers to blend and the time it takes to render the frame. As expected, we did not notice any improvement on TBR architectures, and the technique was even slightly slower than simple blending on some devices. However, on IMR GPUs such as the Tegra 3 in the Nexus 7, rendering time was approximately 35% shorter than without tile-based blending.

Fullscreen     SB           TDB          SB            TDB
layer count    (Nexus 7)    (Nexus 7)    (iPhone 4S)   (iPhone 4S)
8              23           16           7             8
16             45           30           13            14
24             66           44           19            21
32             87.5         58           24            28
40             109          72           32            34

Table 5.2. Tiled deferred blending (TDB) and simple blending (SB) rendering times (in ms).
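The figures in Table 5.2 can be checked with a few lines of arithmetic. The sketch below is written in Python for illustration only; the helper names `linear_fit` and `relative_saving` are hypothetical and not part of the chapter. It fits a least-squares line to each column to show the per-layer cost, and computes the fraction of simple-blending time that tiled deferred blending saves in each row.

```python
# Rendering times in ms, copied from Table 5.2.
layers = [8, 16, 24, 32, 40]
times_ms = {
    "SB (Nexus 7)": [23, 45, 66, 87.5, 109],
    "TDB (Nexus 7)": [16, 30, 44, 58, 72],
    "SB (iPhone 4S)": [7, 13, 19, 24, 32],
    "TDB (iPhone 4S)": [8, 14, 21, 28, 34],
}

def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def relative_saving(sb, tdb):
    """Fraction of SB time saved by TDB; negative means TDB is slower."""
    return [(s - t) / s for s, t in zip(sb, tdb)]

for name, ys in times_ms.items():
    slope, intercept = linear_fit(layers, ys)
    print(f"{name}: {slope:.2f} ms per layer, intercept {intercept:.2f} ms")

nexus = relative_saving(times_ms["SB (Nexus 7)"], times_ms["TDB (Nexus 7)"])
iphone = relative_saving(times_ms["SB (iPhone 4S)"], times_ms["TDB (iPhone 4S)"])
print("Nexus 7 saving:", [f"{s:.1%}" for s in nexus])
print("iPhone 4S saving:", [f"{s:.1%}" for s in iphone])
```

On these numbers the Nexus 7 costs about 2.68 ms per layer with SB and 1.75 ms per layer with TDB, a saving of roughly 30 to 34% per row, in line with the reduction of about one third reported above. On the iPhone 4S the saving is negative in every row, meaning TDB is slightly slower there.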
