Layout of group-shared memory for horizontal and vertical filters for a 5-tap filter with varying thread group
dimensions. The numbers represent which thread in the group loads the texel for that memory location where top-left
is the 0th index. The outer cells indicate additional texels loaded from outside the bounds of the thread group.
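The layout described in this caption can be mimicked on the CPU to see which thread fills which shared-memory slot. The sketch below is in Python rather than the recipe's HLSL, and it is only illustrative: the function names are hypothetical, and the loading scheme (each thread loads its own texel, while the first and last `radius` threads also load the border texels) is an assumed, commonly used pattern rather than something confirmed by the text. It models one axis only, which is the filter direction of the group.

```python
def shared_memory_loaders(group_size, radius):
    """Return a list in which slot i holds the group-thread index that loads it.

    Assumed scheme (hypothetical): thread t writes its own texel to slot
    t + radius; the first `radius` threads also fill the leading border
    slots and the last `radius` threads also fill the trailing border slots.
    """
    if group_size < radius:
        raise ValueError("group_size must be at least radius")
    slots = [None] * (group_size + 2 * radius)
    for t in range(group_size):
        slots[t + radius] = t              # the thread's own texel
        if t < radius:
            slots[t] = t                   # extra texel before the group
        if t >= group_size - radius:
            slots[t + 2 * radius] = t      # extra texel after the group
    return slots


def dispatch_thread_id(group_id, group_thread_id, num_threads):
    """Global thread index along one axis: group index * group size + local index."""
    return group_id * num_threads + group_thread_id


# A small group of 8 threads with radius 2 keeps the output readable.
print(shared_memory_loaders(8, 2))

# Thread groups needed to cover a 1920-texel row with 1024-wide groups.
groups_needed = -(-1920 // 1024)  # ceiling division
print(groups_needed, dispatch_thread_id(1, 0, 1024))
```

With a group of 8 threads and a radius of 2, the two outer slots on each side are filled by threads 0, 1 and 6, 7 respectively, which corresponds to the outer cells of the figure under the assumed scheme.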
Recall from the compute shader thread addressing figure in the explanation of the Running
a compute shader - desaturation (grayscale) recipe that each thread group can contain a
maximum of 1024 threads in Shader Model 5. These threads are given an ID based upon
the number of groups dispatched and the dimensions within the numthreads attribute
of the compute shader. Instead of a 32x32x1 or a 16x4x1 thread group, as we have
used in the previous image processing recipes, in this recipe we have defined a thread
group size to be 1024 threads wide (X-dimension) and one thread high (Y-dimension) for
our BlurFilterHorizontalCS shader, and one thread wide and 1024 high for the
BlurFilterVerticalCS shader. This means that for an ideal image size of 1024x1024,
we sample (1024*1024*2) + (1024*2*radius) + (1024*2*radius) texels, with a filter radius
of three (5-tap filter). This equates to 2.0117 samples per texel (2,109,440 in total), a far cry
from the approximately 6.29 million samples needed for a pixel shader implementation.
Of course, this is the best-case scenario; for a 1920x1080 image, the average samples per
texel is 2.0119. This is due to the overlap of 2*radius between adjacent thread groups: we
need two horizontal and two vertical groups to cover the width and height respectively,
although this could be alleviated by changing the thread dimensions. The previous figure shows
the shared memory layout for the horizontal and vertical filters for a thread group with the
dimensions 1024x1 and 1x1024, respectively, as well as a horizontal and vertical 32x32
thread group size to demonstrate the capabilities of the filter for dealing with 2D shared
memory layouts.
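The sample count quoted above for the 1024x1024 case can be checked with a few lines of arithmetic. The Python sketch below uses a hypothetical helper, `compute_shader_samples`, that simply evaluates the formula from the text; it assumes one thread group spans an entire row (horizontal pass) or column (vertical pass). The quoted total of 2,109,440 is reproduced with radius = 3. Under the usual convention that taps = 2*radius + 1 (an assumption here, not stated in the text), a 5-tap filter would instead correspond to radius = 2, for which the same formula gives a slightly lower total.

```python
def compute_shader_samples(width, height, radius):
    """Texel loads for a two-pass separable blur, following the formula in the text.

    Each pass loads every texel once, plus 2*radius border texels per row
    (horizontal pass) or per column (vertical pass). Assumes a single thread
    group covers the full row or column.
    """
    horizontal = height * (width + 2 * radius)
    vertical = width * (height + 2 * radius)
    return horizontal + vertical


width = height = 1024

total_r3 = compute_shader_samples(width, height, radius=3)
print(total_r3, round(total_r3 / (width * height), 4))   # 2109440 2.0117

total_r2 = compute_shader_samples(width, height, radius=2)
print(total_r2, round(total_r2 / (width * height), 4))   # 2105344 2.0078
```

Either way the cost stays close to two loads per texel, because the border texels are a small fraction of a 1024-wide group.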
 