FIGURE 4.15 Floor plan of the Fermi GTX 480 GPU. This diagram shows 16 multithreaded
SIMD Processors. The Thread Block Scheduler is highlighted on the left. The GTX 480 has 6
GDDR5 ports, each 64 bits wide, supporting up to 6 GB of capacity. The Host Interface is PCI
Express 2.0 x16. Giga Thread is the name of the scheduler that distributes thread blocks to
Multiprocessors, each of which has its own SIMD Thread Scheduler.
Dropping down one more level of detail, the machine object that the hardware creates, manages,
schedules, and executes is a thread of SIMD instructions. It is a traditional thread that
contains exclusively SIMD instructions. These threads of SIMD instructions have their own
PCs, and they run on a multithreaded SIMD Processor. The SIMD Thread Scheduler includes a
scoreboard that lets it know which threads of SIMD instructions are ready to run, and then it
sends them off to a dispatch unit to be run on the multithreaded SIMD Processor. It is identical
to a hardware thread scheduler in a traditional multithreaded processor (see Chapter 3),
except that it schedules threads of SIMD instructions. Thus, GPU hardware has two levels
of hardware schedulers: (1) the Thread Block Scheduler, which assigns Thread Blocks (bodies of
vectorized loops) to multithreaded SIMD Processors and ensures that thread blocks are
assigned to the processors whose local memories have the corresponding data, and (2) the SIMD
Thread Scheduler within a SIMD Processor, which schedules when threads of SIMD instructions
should run.
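The two scheduling levels can be illustrated with a small software model. This is a minimal sketch, not a description of the actual hardware: the function names are hypothetical, the round-robin block assignment is an assumption standing in for the real (data-aware) placement policy, and the scoreboard is reduced to a single "ready at cycle" value per thread of SIMD instructions.

```python
def assign_thread_blocks(num_blocks, num_simd_processors):
    """Level 1 (Thread Block Scheduler), heavily simplified.

    ASSUMPTION: blocks are handed out round-robin. Real hardware also
    takes into account where the corresponding data lives.
    """
    queues = [[] for _ in range(num_simd_processors)]
    for block in range(num_blocks):
        queues[block % num_simd_processors].append(block)
    return queues


def simd_thread_scheduler(latencies_per_thread):
    """Level 2 (SIMD Thread Scheduler), heavily simplified.

    latencies_per_thread[t] lists the latency in cycles (>= 1) of each
    SIMD instruction of thread t. Each thread has its own PC. The
    scoreboard records the cycle at which each thread may issue again.
    At most one instruction is issued per cycle, from the lowest-numbered
    ready thread (ASSUMPTION: fixed-priority selection).
    Returns a list of (cycle, thread) issue events.
    """
    n = len(latencies_per_thread)
    pc = [0] * n          # one program counter per thread of SIMD instructions
    ready_at = [0] * n    # scoreboard: earliest cycle each thread can issue
    trace = []
    cycle = 0
    while any(pc[t] < len(latencies_per_thread[t]) for t in range(n)):
        for t in range(n):
            has_work = pc[t] < len(latencies_per_thread[t])
            if has_work and ready_at[t] <= cycle:
                ready_at[t] = cycle + latencies_per_thread[t][pc[t]]
                pc[t] += 1
                trace.append((cycle, t))
                break     # dispatch one SIMD instruction per cycle
        cycle += 1
    return trace
```

With two threads that each execute a 2-cycle instruction followed by a 1-cycle instruction, the model interleaves them so that one thread issues while the other waits on its result, which is the latency-hiding behavior that multithreading is meant to provide.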
The SIMD instructions of these threads are 32 wide, so each thread of SIMD instructions in
this example would compute 32 of the elements of the computation. In this example, Thread
Blocks would contain 512/32 = 16 SIMD threads (see Figure 4.13).
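The arithmetic above can be checked with a short sketch. The constants 512 and 32 come from the text; the helper names and the assumption that consecutive elements map to consecutive lanes of the same SIMD thread are illustrative only.

```python
SIMD_WIDTH = 32            # elements computed per thread of SIMD instructions
ELEMENTS_PER_BLOCK = 512   # elements handled by one Thread Block


def simd_threads_per_block(elements_per_block=ELEMENTS_PER_BLOCK,
                           simd_width=SIMD_WIDTH):
    """Number of SIMD threads needed to cover one Thread Block."""
    return elements_per_block // simd_width


def locate_element(element_in_block, simd_width=SIMD_WIDTH):
    """Map an element index within a block to (SIMD thread, lane).

    ASSUMPTION: elements are laid out contiguously across lanes.
    """
    return divmod(element_in_block, simd_width)


print(simd_threads_per_block())   # 512 // 32
print(locate_element(33))         # second SIMD thread, second lane
```

Under this layout, element 511 of a block falls in the last lane of the last SIMD thread.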
 