crucial. Faced with introducing an uncompetitive GPU, Intel chose instead to
cancel Larrabee.
38.9 GPUs as Compute Engines
As we've now seen, modern GPUs such as the GeForce 9800 GTX utilize massive
parallelism, exposed as a pipeline of fixed-function and application-programmable
stages, to apply hundreds of GFLOPS to the rendering of 3D graphics. Because
the peak performance of GPUs is so much higher than that of CPUs (see
Section 38.2), and because the SPMD architecture of GPU-programmable stages
makes exploiting that performance straightforward (see Section 38.5),
programmers are highly motivated to speed up their nongraphical programs by porting
them from CPUs to GPUs. We conclude this chapter with a short discussion of
these efforts.
Creative programmers have probably been porting nongraphical algorithms
to GPUs since they came into existence. Under the rubric of GPGPU (general-
purpose computing on GPUs), these efforts became an important trend in the
late '90s. The primary enabler of this trend was the availability of programmable
shaders in ubiquitous, single-chip GPUs such as NVIDIA's predecessors to the
GeForce 9800 GTX.
Algorithms with substantial data parallelism (see Section 38.4) were ported
to GPUs by implementing their kernels as shaders. The kernel of an image-
processing filter, for example, computes the value of a single output pixel as the
weighted sum of the values of nearby pixels. To run the shaders (i.e., to execute
the algorithm), initial data was loaded as a 2D texture, and a 2D rectangle fitted
to the texture was rendered, causing the results to be deposited into the framebuffer.
Over time, researchers identified data-parallel representations for algorithms, such
as sorting, that aren't typically thought of as data-parallel. This allowed GPGPU
to apply to a broader range of problems.
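For concreteness, here is a sketch of such an image-filter kernel. It is not taken from any particular system, and the names blur3x3, dIn, and dOut are made up; it is written in the CUDA style that the next paragraph introduces. In the shader-based approach just described, the same per-pixel computation would instead live in the shader, with the input bound as a 2D texture and the output produced by rasterizing a rectangle into the framebuffer.

// Illustrative CUDA sketch: one thread computes one output pixel as a
// weighted sum of its 3x3 neighborhood, just as one shader invocation
// computed one pixel in the early GPGPU approach.
#include <cuda_runtime.h>

__global__ void blur3x3(const float* in, float* out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    // Small Gaussian weights; in the shader formulation these would be
    // constants in the shader and 'in' would be a 2D texture.
    const float w[3][3] = { {1.f/16, 2.f/16, 1.f/16},
                            {2.f/16, 4.f/16, 2.f/16},
                            {1.f/16, 2.f/16, 1.f/16} };

    float sum = 0.0f;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            int sx = min(max(x + dx, 0), width - 1);   // clamp at the image border
            int sy = min(max(y + dy, 0), height - 1);
            sum += w[dy + 1][dx + 1] * in[sy * width + sx];
        }
    }
    out[y * width + x] = sum;   // the shader version wrote to the framebuffer
}

int main()
{
    const int W = 512, H = 512;
    const size_t bytes = W * H * sizeof(float);

    float *dIn, *dOut;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemset(dIn, 0, bytes);      // stand-in for uploading real image data

    dim3 block(16, 16);
    dim3 grid((W + block.x - 1) / block.x, (H + block.y - 1) / block.y);
    blur3x3<<<grid, block>>>(dIn, dOut, W, H);   // no rectangle to rasterize
    cudaDeviceSynchronize();

    cudaFree(dIn);
    cudaFree(dOut);
    return 0;
}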
As GPGPU became more prevalent, new architectures were developed to better
expose the general-purpose computing capabilities of GPUs. Examples include
OpenCL, Microsoft's DirectCompute, and NVIDIA's CUDA. These architectures
all maintain the SPMD programming model (i.e., the shaders) of the traditional
pipeline architectures, implemented using the same multithreaded SIMD
cores. All, however, dispense with the graphics pipeline and much of its fixed-
function implementation, exposing instead a single compute stage. Compute-
appropriate mechanisms, such as execution commands (it is no longer necessary to
rasterize a rectangle to execute the shader-implemented kernels) and explicit local
memory (as described in Section 38.7.2), are also added. The general-purpose
computing architectures are alternatives to, not replacements for, OpenGL and
Direct3D. Some in fact support interoperation, allowing a single GPU to compute
and display data without transferring intermediate data from GPU to CPU and
back.
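To make these two mechanisms concrete, the sketch below revisits the 3x3 filter in CUDA (the names TILE and blur3x3_shared are again made up). The kernel stages a tile of the input in explicit local memory, declared with CUDA's __shared__ qualifier, and it is started by an explicit kernel launch rather than by rasterizing a rectangle.

// Illustrative only: the 3x3 filter again, now staging an 18x18 input tile
// (16x16 outputs plus a one-pixel border) in explicit local memory so that
// the nine reads per output pixel hit fast on-chip storage.
#define TILE 16

__global__ void blur3x3_shared(const float* in, float* out, int width, int height)
{
    __shared__ float tile[TILE + 2][TILE + 2];

    // Cooperative load: the block's threads together fill the tile,
    // clamping reads at the image border.
    for (int ty = threadIdx.y; ty < TILE + 2; ty += blockDim.y) {
        for (int tx = threadIdx.x; tx < TILE + 2; tx += blockDim.x) {
            int sx = min(max((int)(blockIdx.x * TILE) + tx - 1, 0), width - 1);
            int sy = min(max((int)(blockIdx.y * TILE) + ty - 1, 0), height - 1);
            tile[ty][tx] = in[sy * width + sx];
        }
    }
    __syncthreads();   // make the staged tile visible to every thread in the block

    int x = blockIdx.x * TILE + threadIdx.x;
    int y = blockIdx.y * TILE + threadIdx.y;
    if (x >= width || y >= height) return;

    const float w[3][3] = { {1.f/16, 2.f/16, 1.f/16},
                            {2.f/16, 4.f/16, 2.f/16},
                            {1.f/16, 2.f/16, 1.f/16} };

    float sum = 0.0f;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx)
            sum += w[dy + 1][dx + 1] * tile[threadIdx.y + 1 + dy][threadIdx.x + 1 + dx];

    out[y * width + x] = sum;
}

// The "execution command": an explicit launch over a grid of 16x16 blocks,
// e.g.  blur3x3_shared<<<grid, dim3(TILE, TILE)>>>(dIn, dOut, W, H);

Staging a tile this way trades a small amount of redundant loading at tile borders for many fast on-chip reads, a pattern the rasterize-a-rectangle approach could not express directly because the shader model exposed no explicit local memory.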
GPGPU has come a long way during its decade of development. Today the
fastest supercomputers in the world use GPUs as their primary computing engines,
as do applications ranging from hedge-fund management to quantum physics.
GPUs will never replace CPUs, but it seems increasingly likely that a new
computing architecture, derived from GPU and CPU technology and perhaps resembling
Intel's Larrabee prototype, will define the future of computer architecture.