Geometry Unit. This unit is responsible for distributing the primitives to various
computational units. If this is the bottleneck, the only likely solution is to reduce
geometric complexity. Now, if you look at the pixel count, you might see, as stated
earlier, that you have a high primitive count with a low pixel count, which means
you are wasting all of that geometry and should look at your level of detail (LOD)
system to help.
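As a rough illustration of the primitive-to-pixel check described above, the sketch below picks a level of detail from per-draw counters. The function names, the ordering of the LOD list, and the threshold of one triangle per pixel are assumptions made for this example; they are not values taken from the profiler or from the text.

```python
def triangles_per_pixel(primitive_count, pixel_count):
    """Ratio of primitives submitted to pixels shaded for one draw call."""
    if pixel_count <= 0:
        # Nothing reached the screen: any geometry at all is wasted.
        return float("inf") if primitive_count > 0 else 0.0
    return primitive_count / pixel_count


def pick_lod(primitive_counts, pixel_count, max_tris_per_pixel=1.0):
    """Return the index of the most detailed LOD under the density threshold.

    primitive_counts is ordered from most detailed (index 0) to least
    detailed. max_tris_per_pixel is a hypothetical tuning value.
    Falls back to the coarsest LOD when no level qualifies.
    """
    for index, count in enumerate(primitive_counts):
        if triangles_per_pixel(count, pixel_count) <= max_tris_per_pixel:
            return index
    return len(primitive_counts) - 1
```

For a mesh with LODs of 20000, 5000, 1200, and 300 triangles that covers about 1500 pixels, `pick_lod([20000, 5000, 1200, 300], 1500)` selects index 2, the first level that stays at or below one triangle per pixel.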
Shader. This is the workhorse of the modern GPU. All of your shaders, vertex
through pixel, get processed by this unit. Even so, if the Shader
Unit is your bottleneck, you can still potentially see some improvements by rebalancing
where you perform work. Ask yourself: Do all of the operations in your
pixel shader have to be computed per pixel? Can this work be done per vertex
and interpolated across the triangle face? Can the calculation be prebaked into a
texture and made into a simple sampling call? Back to LOD: is the contribution of
this shader so small or unlikely to be appreciated because of distance to the camera
that you can actually run a simpler shader? All of these techniques still apply even
though the units are shared: every shader invocation is just a thread running in the
Shader Unit, so anything that reduces how often, or how long, a shader runs frees up the unit.
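The rebalancing question can be framed with a crude back-of-the-envelope model: total Shader Unit work is roughly invocations times operations, summed over stages. The sketch below assumes that simplified model; the function names and the idea of counting abstract "ops" are hypothetical, and the model ignores interpolation cost, occupancy, and latency hiding.

```python
def shader_unit_work(vertex_count, pixel_count, vs_ops, ps_ops):
    """Approximate work for one draw: invocations times ops, per stage."""
    return vertex_count * vs_ops + pixel_count * ps_ops


def savings_from_moving_to_vertex(vertex_count, pixel_count, moved_ops):
    """Ops saved by moving moved_ops from the pixel to the vertex shader.

    Positive when the draw shades more pixels than it has vertices;
    negative when the mesh is denser than its screen coverage, in which
    case the move costs more than it saves.
    """
    return moved_ops * (pixel_count - vertex_count)
```

Under this model, a draw with 1000 vertices and 50000 shaded pixels saves 490000 ops when 10 ops move from the pixel shader to the vertex shader, while a draw with 5000 vertices covering only 1500 pixels would lose 35000. The second case is the same high-primitive, low-pixel situation that points to an LOD problem.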
Texture. The Texture Unit can feed the Shader Unit no matter what type of shader
it is running. Textures aren't just for pixel shaders, but can be displacement maps
for vertex or geometry shaders. If the bottleneck information reads high for the
Texture Unit, the main things to look at are compression, mipmaps, and sampling
setup. Compression is lossy, so there can be quality concerns, but most diffuse textures
can be compressed with little or no visible loss of image fidelity. Normal maps and other
textures that contain mathematical data can be more susceptible to artifacts, and
experimentation is the only way to determine whether that will be a concern. If possible, make sure you are
using mipmaps to help with memory access locality. Click on one of the draw calls
to inspect the input resources. You can also see if the appropriate sampling mode
(point, bilinear, anisotropic, etc.) is being used, and be judicious with your choice
of sampling. Bilinear and trilinear can handle many cases just fine. Anisotropic
is expensive and will show no improvement for rendering that is always screen
aligned. Finally, if you have shader cycles to spare (look at the utilization graph
for the Shader Unit), you can consider computing a precalculated texture (like noise
or something similar) directly in the shader instead, saving some of the texture work.
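To get a feel for what compression and mipmaps do to the amount of data the Texture Unit has to fetch, the sketch below estimates the footprint of a texture. It assumes uncompressed RGBA8 at 4 bytes per texel and block-compressed formats stored as 4x4 texel blocks (8 bytes per block for BC1, 16 for BC3); the function names are made up for this example, and real drivers may add padding or alignment that is not modeled here.

```python
def mip_chain_dims(width, height):
    """Return (w, h) for every mip level from the base level down to 1x1."""
    dims = []
    w, h = width, height
    while True:
        dims.append((w, h))
        if w == 1 and h == 1:
            break
        w, h = max(1, w // 2), max(1, h // 2)
    return dims


def level_bytes(w, h, bytes_per_block=None):
    """Size of one mip level; bytes_per_block=None means uncompressed RGBA8."""
    if bytes_per_block is None:
        return w * h * 4
    # Block-compressed formats round each dimension up to whole 4x4 blocks.
    blocks_w = (w + 3) // 4
    blocks_h = (h + 3) // 4
    return blocks_w * blocks_h * bytes_per_block


def texture_bytes(width, height, bytes_per_block=None, mipmapped=True):
    """Total bytes for a texture, optionally including its full mip chain."""
    levels = mip_chain_dims(width, height) if mipmapped else [(width, height)]
    return sum(level_bytes(w, h, bytes_per_block) for w, h in levels)
```

For a 1024x1024 texture, `texture_bytes(1024, 1024, None, False)` gives 4194304 bytes, while `texture_bytes(1024, 1024, 8, True)` gives 699064 bytes for BC1 with a full mip chain. The mip chain adds about a third to the footprint, but distant surfaces then sample from small levels, which is where the memory access locality benefit comes from.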
Raster operations. This unit is also called Output Merger in the DirectX specifications.
This is where the read/modify/write operation is done when blending
is enabled, but it also has work to do when performing a lot of z-comparisons or converting
the output of your shader into a different format. The best optimization
strategy for the Raster Operations unit is to limit what it has to do. Can you
disable blending? Can you disable it based on level of detail? Of course, making
sure that you render your opaque geometry front to back will help as well and will