extended in 2001. Since that time support for shaders has become a defining fea-
ture of GPUs.
Despite the delay in exposing programmability to application programmers,
from the start essentially all implementations of OpenGL and Direct3D were pro-
grammed by their developers. For example, the programmable logic array (PLA)
that specified the behavior of the Geometry Engine, which formed the core of
Silicon Graphics' graphics business in the early '80s, was programmed prior to
circuit fabrication using a Stanford-developed microcode assembler. Subsequent
Silicon Graphics GPUs included microcoded compute engines, which could be
reprogrammed after delivery as a software update. But application programmers
were not allowed to specify these program changes. Instead, they were limited
to specifying modes that determined GPU operation within the constraints of the
seemingly “hardwired” pipeline architecture.
There are several reasons that the programmability of mainstream GPUs
remained hidden behind modal architectural interfaces throughout the '80s and
'90s. Directly exposing implementation programming models, which differed
substantially from one GPU to another, would have forced applications to be
recoded as technology evolved. This would have broken forward compatibility,
a key tenet of any computing architecture. (Programmers reasonably expect their
code to run without change, albeit faster, on future systems.) Today's GPU drivers
solve this problem by cross-compiling high-level, implementation-independent
shaders into low-level, implementation-dependent microcode. But these software
technologies were just maturing during the '80s and '90s; they would have exe-
cuted too slowly on CPUs of that era.
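The driver's role as a cross-compiler can be made concrete with a deliberately tiny sketch. The code below (every name in it — `compile_expr`, `run`, the three-operand instruction tuples — is invented for illustration) lowers a high-level arithmetic expression, standing in for a shader, into a flat list of microcode-like instructions and then interprets them. Real drivers of course compile full shading languages such as GLSL or HLSL to proprietary GPU instruction sets.

```python
import ast
import itertools

# Toy illustration (all names invented): "cross-compile" a high-level
# arithmetic expression -- a stand-in for a shader -- into a flat list
# of microcode-like three-operand instructions, then interpret them.

OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}

def compile_expr(src):
    """Lower an expression over named inputs to (dest, op, a, b) tuples."""
    prog, temps = [], itertools.count()

    def lower(node):
        if isinstance(node, ast.Name):      # named input, e.g. a vertex attribute
            return node.id
        if isinstance(node, ast.Constant):  # literal constant
            return node.value
        if isinstance(node, ast.BinOp):     # emit one instruction per operator
            a, b = lower(node.left), lower(node.right)
            dest = f"r{next(temps)}"        # allocate a fresh "register"
            prog.append((dest, OPS[type(node.op)], a, b))
            return dest
        raise ValueError("unsupported construct")

    return prog, lower(ast.parse(src, mode="eval").body)

def run(prog, result, env):
    """Interpret the instruction list; env maps input names to values."""
    regs = dict(env)
    val = lambda x: regs[x] if isinstance(x, str) else x
    for dest, op, a, b in prog:
        x, y = val(a), val(b)
        regs[dest] = x + y if op == "ADD" else x - y if op == "SUB" else x * y
    return regs[result] if isinstance(result, str) else result

# A diffuse-lighting-like expression lowers to two instructions (MUL, ADD).
prog, out = compile_expr("n_dot_l * albedo + ambient")
print(prog)
print(run(prog, out, {"n_dot_l": 2, "albedo": 3, "ambient": 4}))
```

The point of the sketch is the split it exhibits: the application author writes only the implementation-independent expression, while the "driver" (here, `compile_expr`) owns the lowering to whatever instruction set the hardware happens to expose, and is free to change that lowering from one product generation to the next.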
These reasons, and others, can be summarized in terms of the era's exponential advances in performance and complexity:
- Need: Increasing GPU performance created the need for application-specified programmability, both because more complex operations could be supported at interactive rates, and because modal specification of these operations became prohibitively complex.

- GPU hardware capability: High-performance GPUs of the '80s and '90s were implemented using multiple components sourced from various vendors. But increased transistor count allows modern GPUs to be implemented as single integrated circuits. Single-chip implementations give implementors much more control, simplifying cross-compilation from a high-level language to the GPU microcode by minimizing implementation differences from one product generation to the next.

- CPU hardware capability: Increasing CPU performance allowed GPU driver software to perform the (simplified) cross-compilation from high-level language to hardware microcode with adequate performance (both of the compilation and of the resultant microcode).
38.6 Texture, Memory, and Latency
Thus far our discussions of parallelism and programmability have treated the
graphics pipeline as a stream processor. Such a processor operates on individ-
ual, predefined, and (typically) small data units, such as the vertices, primitives,
and fragments of the graphics pipeline, without accessing data from a larger exter-
nal memory. But the graphics pipeline is in reality not so limited. Instead, as can