5.2 Algorithm
The algorithm is based on a division of the rendering area into smaller screen-
space tiles. Tile layouts are discussed in Section 5.3.1.
Once a layout has been defined, the general algorithm can be summarized in
the following steps (see Figure 5.2).
1. Project vertices on CPU to find the screen-space extent of each sprite quad.
2. For each quad, find the tiles it intersects and store the quad's ID in each affected cell.
3. (optional) Optimize grid by grouping compatible cells (see Section 5.4.1).
4. For each tile, compute required information to render aggregated sprites.
5. For each nonempty tile and for each fragment, use the interpolated texture
coordinates to sample the bound textures and blend the results manually.
First, the sprites' vertices are transformed on the CPU to determine where
they will land on screen after projection. Once the screen-space positions are
known, we can compute, for each tile, the list of sprites affecting it. Because
OpenGL ES versions prior to 3.1 lack compute shaders, all of these computations
must be done on the CPU.
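As an illustration of steps 1 and 2, here is a minimal CPU-side sketch in Python. It is not the chapter's implementation: the function names, the row-major 4x4 matrix layout, the uniform square-tile grid, and the use of a conservative bounding box per quad are all assumptions made for the example. It also assumes every vertex lies in front of the camera (positive w) and that quads are submitted back to front.

```python
from collections import defaultdict

def project_to_screen(vertex, mvp, width, height):
    """Transform a 3D point by a row-major 4x4 matrix and map NDC to pixels."""
    x, y, z = vertex
    clip = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in mvp]
    w = clip[3]  # assumed > 0 (vertex in front of the camera)
    ndc_x, ndc_y = clip[0] / w, clip[1] / w
    return ((ndc_x * 0.5 + 0.5) * width, (ndc_y * 0.5 + 0.5) * height)

def bin_sprites(quads, mvp, width, height, tile_size):
    """Return {(tile_x, tile_y): [sprite ids]} from screen-space bounding boxes.

    width, height and tile_size are integers in pixels. The bounding box is
    conservative: a rotated quad may be listed in a tile it does not touch.
    """
    tiles_x = -(-width // tile_size)   # ceiling division
    tiles_y = -(-height // tile_size)
    grid = defaultdict(list)
    for sprite_id, quad in enumerate(quads):
        pts = [project_to_screen(v, mvp, width, height) for v in quad]
        min_x = min(p[0] for p in pts)
        max_x = max(p[0] for p in pts)
        min_y = min(p[1] for p in pts)
        max_y = max(p[1] for p in pts)
        # Skip quads that are entirely off screen.
        if max_x <= 0 or max_y <= 0 or min_x >= width or min_y >= height:
            continue
        tx0 = max(int(min_x // tile_size), 0)
        tx1 = min(int(max_x // tile_size), tiles_x - 1)
        ty0 = max(int(min_y // tile_size), 0)
        ty1 = min(int(max_y // tile_size), tiles_y - 1)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                # IDs are appended in submission order (assumed back to front).
                grid[(tx, ty)].append(sprite_id)
    return dict(grid)
```

Keeping the IDs in submission order inside each cell means the per-tile list can later be walked directly in transparency order.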
For complex scenes containing lots of sprites to blend together, SIMD
instructions (e.g., the ARM NEON¹ instruction set) provide a good opportunity to
reduce the extra CPU overhead induced by the technique. Libraries² are available
to get you started quickly.
After computing the list of sprites affecting each tile, we can compute, for
each cell and each sprite, the texture coordinate transform that maps the tile's
texture coordinates to those of each sprite it aggregates. These 3 × 2 transform
matrices (2D rotation + translation) will later be passed as uniforms to the
vertex shader.
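To make the texture coordinate transform concrete, the following Python sketch builds the six coefficients of an affine 2D transform that maps a tile's [0, 1] texture coordinates to a sprite's. The sprite parameterization (screen-space center, size, and rotation angle), the square tile, and both function names are assumptions chosen for the example; the sketch also folds the tile-to-sprite scale into the matrix.

```python
import math

def tile_to_sprite_uv(tile_origin, tile_size, center, size, angle):
    """Return a 2-row, 3-column affine matrix mapping tile UVs to sprite UVs.

    tile_origin: screen position of the tile corner where the tile UV is (0, 0)
    tile_size:   edge length of the (square) tile in pixels
    center/size: sprite center and (width, height) in pixels
    angle:       sprite rotation in radians
    """
    ox, oy = tile_origin
    cx, cy = center
    w, h = size
    c, s = math.cos(angle), math.sin(angle)
    dx, dy = ox - cx, oy - cy
    # A screen point p has sprite UV 0.5 + dot(p - center, axis) / extent,
    # and p = tile_origin + tile_size * tile_uv, which gives the rows below.
    return (
        (tile_size * c / w, tile_size * s / w, 0.5 + (dx * c + dy * s) / w),
        (-tile_size * s / h, tile_size * c / h, 0.5 + (-dx * s + dy * c) / h),
    )

def apply_uv_transform(m, uv):
    """Apply the affine matrix to a tile UV, as the vertex shader would."""
    u, v = uv
    return (m[0][0] * u + m[0][1] * v + m[0][2],
            m[1][0] * u + m[1][1] * v + m[1][2])
```

Because the transform is affine, it only needs to be evaluated at the tile's corners; interpolation across the tile then yields the correct sprite coordinates for every fragment.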
Finally, the render phase itself simply consists of rendering the tiles, one at a
time, using the interpolated texture coordinates to sample each texture (the same
texture can be bound to several samplers if desired) and blending the intermediate
results manually, respecting the transparency order.
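The manual blend can be modeled on the CPU as follows. This Python sketch is only a reference for the arithmetic a fragment shader might perform; it assumes non-premultiplied RGBA samples sorted back to front and the usual "source over" operator, neither of which is stated in the text. The function name is hypothetical.

```python
def blend_samples(samples):
    """Composite RGBA samples (sorted back to front) into one premultiplied RGBA.

    Each sample is (r, g, b, a) with non-premultiplied color. The accumulator
    starts as transparent black, so the result can be blended once with the
    framebuffer instead of once per sprite.
    """
    out_r = out_g = out_b = out_a = 0.0
    for r, g, b, a in samples:
        inv = 1.0 - a
        out_r = r * a + out_r * inv
        out_g = g * a + out_g * inv
        out_b = b * a + out_b * inv
        out_a = a + out_a * inv
    return (out_r, out_g, out_b, out_a)
```

Under these assumptions the tile's output is premultiplied, so a single fixed-function blend such as `glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA)` would combine it with whatever is already in the framebuffer.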
An optional optimization phase (step 3) can take place after building the
per-tile sprite lists and before computing the extrapolated texture coordinates;
this optimization consists of merging cells that share exactly the same sprites,
thus lowering the number of primitives sent to the GPU.
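One simple way to group compatible cells is to merge horizontal runs of neighboring cells whose sprite lists are identical. The Python sketch below shows that strategy only; it is an assumption for illustration and not necessarily the grouping described in Section 5.4.1. The `grid` argument is a hypothetical dict mapping `(tile_x, tile_y)` to a list of sprite IDs.

```python
def merge_row_runs(grid, tiles_x, tiles_y):
    """Merge horizontally adjacent cells with identical sprite lists.

    Returns a list of (tile_x_start, tile_y, run_length, sprite_ids) tuples,
    each describing one rectangle to submit instead of run_length tiles.
    """
    merged = []
    for ty in range(tiles_y):
        tx = 0
        while tx < tiles_x:
            sprites = grid.get((tx, ty))
            if not sprites:          # empty cells are never rendered
                tx += 1
                continue
            end = tx + 1
            # Extend the run while the neighbor holds exactly the same list.
            while end < tiles_x and grid.get((end, ty)) == sprites:
                end += 1
            merged.append((tx, ty, end - tx, tuple(sprites)))
            tx = end
    return merged
```

Comparing whole lists keeps the transparency order intact: two cells are merged only if they contain the same sprites in the same order.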
1 ARM is a registered trademark of ARM Limited (or its subsidiaries) in the EU and/or
elsewhere. NEON is a trademark of ARM Limited (or its subsidiaries) in the EU and/or
elsewhere. All rights reserved.
2 https://code.google.com/p/math-neon/