Game Development Reference
Figure 21.10. The Frame Profiler Results Page.
in Section 21.2.1. You can profile the frame by clicking the Profile Frame link in
the lower-left corner of the Frames Page.
Once you have selected the Frame Profiler, Parallel Nsight will run a series of
experiments on the captured frame, collecting bottleneck and utilization data at key
points in the GPU pipeline. Depending on the GPU architecture, this can take 40 to
50 passes on the same frame, so be patient, and you will be rewarded. If you are
running on non-NVIDIA hardware, you will see a limited amount of information,
including primitive counts and draw call duration.
When the profiler is finished, it will show a number of lists on the top of the
screen along with some graphs below (see Figure 21.10). If you select the top-most
entry on the top-left list, this will show you, in the list box to the right, all of the
draw calls in the frame along with some useful data. First, the draw calls will be
sorted by GPU time consumed. You can of course select other columns to sort by,
but if you are GPU bound, this is helpful to determine which draw call is the most
expensive (more on this in Section 21.4.3). The draw call list also shows CPU time
spent, primitives drawn, and pixels drawn. If you want to see all of the details
about a given draw call, click the link in the first column, and it will bring you
to the Draw Call Page discussed in Section 21.2.3. From there, you can check for
things like missing mipmaps and improperly set blending modes (more on this
in Section 21.4.4). To get back to the Frame Profiler Results Page, simply select the
back arrow in the navigation bar.
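
If you copy the draw call data out of the profiler for offline analysis, the default ordering described above is easy to reproduce. The following is a minimal sketch only: the DrawCall record, its field names, and the sample numbers are hypothetical and do not correspond to any Parallel Nsight export format.

```python
from dataclasses import dataclass


@dataclass
class DrawCall:
    """Hypothetical record mirroring the columns of the draw call list."""
    index: int           # position of the draw call within the frame
    gpu_time_us: float   # GPU time consumed, in microseconds
    cpu_time_us: float   # CPU time spent, in microseconds
    primitives: int      # primitives drawn
    pixels: int          # pixels drawn


def sort_by_gpu_time(draw_calls):
    """Return the draw calls ordered from most to least GPU time."""
    return sorted(draw_calls, key=lambda dc: dc.gpu_time_us, reverse=True)


# Made-up sample frame, for illustration only.
frame = [
    DrawCall(0, 120.0, 15.0, 2_000, 500_000),
    DrawCall(1, 950.0, 20.0, 150_000, 1_200),
    DrawCall(2, 430.0, 10.0, 40_000, 300_000),
]

ranked = sort_by_gpu_time(frame)
most_expensive = ranked[0]  # first candidate to inspect if GPU bound
```

Sorting on a different field (for example cpu_time_us) only requires changing the key function, which corresponds to selecting a different column in the list.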
Another important thing to look for in this list is draw calls that take a long
time and potentially render a lot of primitives but only modify a few pixels. This
means you are spending a lot of precious resources on something that does not
contribute much to the scene. You can fix this by using level of detail (LOD)
models, shaders, and/or resources to reduce the workload on the GPU for these
less important objects in the scene.
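
The check described above can be sketched as a simple filter over draw call statistics. This is an illustrative heuristic, not a feature of the profiler: the DrawCallStats record, the sample numbers, and both thresholds are assumptions chosen for the example and would need tuning for a real scene.

```python
from typing import NamedTuple


class DrawCallStats(NamedTuple):
    """Hypothetical per-draw-call numbers read off the draw call list."""
    index: int
    gpu_time_us: float
    primitives: int
    pixels: int


def low_contribution_calls(calls, min_gpu_time_us=200.0,
                           max_pixels_per_primitive=1.0):
    """Flag draw calls that are expensive but touch few pixels per primitive.

    Both thresholds are arbitrary illustration values (assumptions).
    """
    flagged = []
    for dc in calls:
        if dc.primitives == 0:
            continue  # nothing was drawn, so the ratio is undefined
        pixels_per_primitive = dc.pixels / dc.primitives
        if (dc.gpu_time_us >= min_gpu_time_us
                and pixels_per_primitive <= max_pixels_per_primitive):
            flagged.append(dc)
    return flagged


# Made-up sample data, for illustration only.
calls = [
    DrawCallStats(0, 120.0, 2_000, 500_000),   # cheap, covers many pixels
    DrawCallStats(1, 950.0, 150_000, 1_200),   # expensive, barely visible
    DrawCallStats(2, 430.0, 40_000, 300_000),  # expensive, but contributes
]

lod_candidates = low_contribution_calls(calls)
```

Draw calls returned by such a filter are the ones worth reviewing for a lower level of detail, since they consume GPU time without changing much of the final image.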
 