This implementation could be improved further to support
sampling four texels per thread, although each thread group would then have to
use fewer threads because of the maximum thread group shared memory size.
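The trade-off between texels per thread and thread count can be checked with some rough arithmetic. The following is only a sketch under stated assumptions: it uses the 32 KB group-shared memory limit and 1024-thread group limit of the cs_5_0 compute shader profile, it assumes each cached texel is stored as a float4 (16 bytes), and it ignores the few extra border texels a filter needs. The function name is hypothetical and does not come from the recipe.

```python
# Rough budget for a thread group's shared memory (cs_5_0 limits).
GROUP_SHARED_LIMIT_BYTES = 32 * 1024   # maximum group-shared memory per thread group
MAX_THREADS_PER_GROUP = 1024           # maximum threads per thread group
BYTES_PER_TEXEL = 16                   # assumption: each texel is cached as a float4


def max_threads_per_group(texels_per_thread):
    """Largest thread count whose cached texels still fit in shared memory."""
    by_memory = GROUP_SHARED_LIMIT_BYTES // (BYTES_PER_TEXEL * texels_per_thread)
    return min(MAX_THREADS_PER_GROUP, by_memory)


for texels in (1, 2, 4):
    print(texels, "texel(s) per thread ->", max_threads_per_group(texels), "threads")
```

Under these assumptions, two texels per thread still allow a full group of 1024 threads, while four texels per thread limit the group to 512 threads.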
The following figure shows the group-shared memory values during the convolution of a
horizontal 3-tap box blur filter and image using this recipe:
Convolution of horizontal 3-tap filter and image in progress showing the shared memory of two thread groups.
Out-of-bound samples have been clamped.
Looking at the full 3x3 kernel in the earlier figure, Convolution filter kernel (center) applied to
input image, it is apparent that the edge cases require some additional checking. For example,
if the kernel is centered over the top-left pixel, the top and left sides of the kernel will be
outside the image bounds. This can be ignored but will result in some artifacts around the
borders; as an example, the box blur filter will gain a dark border around the right and
bottom edges. Alternatively, we can clamp to the bounds of the input image, lending additional
weight to border pixels (as we have done in this recipe, see the previous figure), or start the
filter within the bounds of the image and produce a slightly smaller image on output. Clamping
has the added benefit that we do not need to worry if the thread count does not exactly match
the texel count: a few extra threads will not impact the result.
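Two of these edge strategies can be compared on the CPU with a single row of values. This is a minimal sketch, not the recipe's shader: it applies a horizontal 3-tap box blur to a Python list, and the function name and the mode names "clamp" and "crop" are hypothetical labels chosen for illustration.

```python
def box_blur_row(row, mode="clamp"):
    """Horizontal 3-tap box blur of one row of grayscale values.

    mode="clamp": out-of-bounds taps reuse the nearest edge texel,
                  so the output has the same length as the input.
    mode="crop":  the filter starts inside the bounds, so the output
                  is two texels shorter than the input.
    """
    n = len(row)
    if mode == "crop":
        centers = range(1, n - 1)   # every tap stays inside the row
    else:
        centers = range(n)          # edge taps are clamped below
    out = []
    for x in centers:
        total = 0.0
        for offset in (-1, 0, 1):
            i = min(max(x + offset, 0), n - 1)  # clamp to [0, n - 1]
            total += row[i]
        out.append(total / 3.0)
    return out


row = [10.0, 20.0, 30.0, 40.0]
clamped = box_blur_row(row, "clamp")  # same size; border texels are counted twice
cropped = box_blur_row(row, "crop")   # slightly smaller output
print(clamped)
print(cropped)
```

With clamping, the first output value averages 10, 10, and 20, which shows the additional weight given to the border texel. The cropped variant simply skips the positions where a tap would fall outside the row.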
To allow us to accumulate multiple filters, we have created an additional texture. As the target
textures must now also be used as input, we have enabled the shader resource binding flag.
The example output of the Implementing a Gaussian blur filter recipe includes comparisons of
filters that have been applied multiple times. With this ping-pong approach to textures, we can
combine any number of filters together.
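The ping-pong idea can be sketched independently of Direct3D. In the following CPU-side illustration, two Python lists stand in for the two textures, and each filter is a function that reads from one buffer and writes to the other. The names apply_filters and blur3 are hypothetical, and the blur uses clamped edges as an assumption.

```python
def blur3(src, dst):
    """Horizontal 3-tap box blur with clamped edges, written into dst."""
    n = len(src)
    for x in range(n):
        left = src[max(x - 1, 0)]
        right = src[min(x + 1, n - 1)]
        dst[x] = (left + src[x] + right) / 3.0


def apply_filters(image, filters):
    """Run each filter in turn, alternating between two buffers."""
    source = list(image)          # current input (copy keeps the caller's data intact)
    target = [0.0] * len(image)   # current output
    for apply_filter in filters:
        apply_filter(source, target)
        # Swap roles: the output of this pass is the input of the next.
        source, target = target, source
    return source


result = apply_filters([0.0, 0.0, 9.0, 0.0, 0.0], [blur3, blur3])
print(result)
```

Because the buffers only swap roles, the list of filters can be of any length without allocating more than two buffers, which mirrors why both textures need to be usable as input and as output.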
 