Quadtrees on the GPU - GPU Pro: Advanced Rendering Techniques

Graphics Reference

In-Depth Information

3

VI

Quadtrees on the GPU

Jonathan Dupuy, Jean-Claude Iehl,

and Pierre Poulin

3.1 Introduction

Finding an appropriate mathematical representation for objects is a fundamental

problem in computer graphics. Because they benefit from hardware acceleration,

polygon meshes provide an appealing solution and are widely adopted by both in-

teractive and high-quality renderers. But in order to maintain optimal rendering

performance, special care must be taken to guarantee that each polygon projects

into more than a few pixels. Below this limit, the Z-buffer starts aliasing, and

the rasterizer's eciency decreases drastically [AMD 10]. In the movie industry,

this limitation is alleviated by using very high sampling rates, at the expense of

dissuasive computation times. For applications that target interactivity, however,

such as video games or simulators, such overhead is unaffordable.

Another solution is to adapt the mesh to produce optimal on-screen polygons.

If it can be simplified and/or enriched on the fly, it is said to be a scalable

representation. Quadtrees provide such a scheme for grids and have been used

extensively since the early days of rendering, especially for terrain synthesis.

Despite their wide adoption, though, the first full GPU implementation was only

introduced last year [Mistal 13]. This is due to two main reasons. First, quadtrees

are usually implemented recursively, which makes them hard to handle in parallel

by their very nature (see the implementation of Strugar [Strugar 09], for instance).

Second, they require a T-junction removal system, which may result in an increase

of draw calls [Andersson 07], adding non-negligible overhead to the CPU. As an

alternative to quadtrees, instanced subdivision patches have been investigated

[Cantlay 11,Fernandes and Oliveira 12]. Although such solutions can run entirely

on the GPU, their scalability is currently too limited by the hardware.

In this chapter, we present a new implementation suitable for the GPU, which

completely relieves the CPU without sacrificing scalability. In order to update the

quadtree in parallel, we use the linear quadtree representation, which is presented

Search WWH ::

Custom Search

Home