// Loop over slices in filter
for (int zz = FILTER_D - 1; zz >= 0; zz--)
{
    // Copy current filter coefficients to constant memory
    CopyFilterCoefficients(zz);
    // Perform 2D convolution and
    // accumulate the filter responses inside the kernel,
    // launch kernel for several slices simultaneously
    Convolution_2D_Shared<<<dG, dB>>>(d_FR);
}
Listing 5.5. Host code for non-separable 3D convolution, by performing non-separable
2D convolution on the GPU and accumulating the filter responses inside the kernel (dG
stands for dimGrid, dB stands for dimBlock, and FR stands for filter responses). Just as
for the non-separable 4D convolution, the 2D convolution is launched for all slices at
the same time to increase the occupancy.
the complete code is available in the repository. The only difference compared to
4D convolution is that the host (CPU) code for 4D convolution also loops over time
points (for data and filters). Just as for 4D convolution, the 2D convolution is
launched for all slices at the same time to increase the occupancy.
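The accumulation scheme in Listing 5.5 can be checked on the CPU. The following is a minimal sketch in plain C++, not the GPU implementation: it assumes zero padding at the volume borders, and the sizes W, H, D, FW, FH, FD as well as all function names are hypothetical, chosen only for illustration. It compares a direct 3D convolution with the result of looping over the filter slices and accumulating 2D filter responses.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical sizes, kept small for illustration.
constexpr int W = 6, H = 5, D = 4;     // volume width, height, depth
constexpr int FW = 3, FH = 3, FD = 3;  // filter width, height, depth

// Zero-padded read from the volume (stored with x as the fastest index).
float voxel(const std::vector<float>& v, int x, int y, int z) {
    if (x < 0 || x >= W || y < 0 || y >= H || z < 0 || z >= D) return 0.0f;
    return v[x + W * (y + H * z)];
}

// Reference: direct non-separable 3D convolution.
std::vector<float> conv3d_direct(const std::vector<float>& vol,
                                 const std::vector<float>& filt) {
    std::vector<float> out(W * H * D, 0.0f);
    for (int z = 0; z < D; ++z)
      for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x) {
            float sum = 0.0f;
            for (int fz = 0; fz < FD; ++fz)
              for (int fy = 0; fy < FH; ++fy)
                for (int fx = 0; fx < FW; ++fx)
                    sum += filt[fx + FW * (fy + FH * fz)] *
                           voxel(vol, x - (fx - FW / 2), y - (fy - FH / 2),
                                 z - (fz - FD / 2));
            out[x + W * (y + H * z)] = sum;
        }
    return out;
}

// Slice-wise variant: one 2D convolution pass per filter slice,
// with the responses accumulated into the output volume.
std::vector<float> conv3d_by_slices(const std::vector<float>& vol,
                                    const std::vector<float>& filt) {
    std::vector<float> out(W * H * D, 0.0f);
    for (int zz = FD - 1; zz >= 0; zz--) {
        // On the GPU, this is where filter slice zz would be copied to
        // constant memory and a 2D kernel launched for all volume slices.
        for (int z = 0; z < D; ++z)
          for (int y = 0; y < H; ++y)
            for (int x = 0; x < W; ++x) {
                float sum = 0.0f;
                for (int fy = 0; fy < FH; ++fy)
                  for (int fx = 0; fx < FW; ++fx)
                      sum += filt[fx + FW * (fy + FH * zz)] *
                             voxel(vol, x - (fx - FW / 2), y - (fy - FH / 2),
                                   z - (zz - FD / 2));
                out[x + W * (y + H * z)] += sum;  // accumulate responses
            }
    }
    return out;
}

// Largest absolute difference between the two variants on test data.
float demo_max_difference() {
    std::vector<float> vol(W * H * D), filt(FW * FH * FD);
    for (int i = 0; i < W * H * D; ++i) vol[i] = float(i % 7) - 3.0f;
    for (int i = 0; i < FW * FH * FD; ++i) filt[i] = float(i % 5) - 2.0f;
    const std::vector<float> a = conv3d_direct(vol, filt);
    const std::vector<float> b = conv3d_by_slices(vol, filt);
    float max_diff = 0.0f;
    for (int i = 0; i < W * H * D; ++i)
        max_diff = std::max(max_diff, std::fabs(a[i] - b[i]));
    return max_diff;
}
```

Calling demo_max_difference() from a main function should return a value at or very near zero, since both variants add up the same products and differ only in summation order.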
5.9 Performance
We will now list some performance measures for our implementations. All tests
were run on an Nvidia GTX 680 graphics card with 4 GB of memory.
5.9.1 Performance, 2D Filtering
Performance estimates for non-separable 2D filtering are given in Figures 5.6-5.7.
Time for transferring the data to and from the GPU is not included. The first
plot is for a fixed image size of 2048 × 2048 pixels and filter sizes ranging from
3 × 3 to 17 × 17. The second and third plots are for fixed filter sizes of 9 × 9 and
17 × 17, respectively. The image sizes for these plots range from 128 × 128 to
4096 × 4096 in steps of 128 pixels. All plots contain the processing time for spatial
convolution using texture memory (with and without loop unrolling), spatial
convolution using shared memory (with and without loop unrolling), and FFT-based
filtering using the CUFFT library (involving two forward FFTs, complex-valued
multiplication, and one inverse FFT).
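The three steps of FFT-based filtering (two forward transforms, a complex-valued multiplication, one inverse transform) can be illustrated on a small 1D signal. The sketch below is plain C++ on the CPU and does not use CUFFT: a naive O(N²) DFT stands in for the FFT, the data is 1D rather than 2D, and all function names are hypothetical. It assumes the filter is zero-padded to the signal length, and it compares the result with a direct circular convolution, since filtering in the frequency domain implies circular boundary handling.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Naive O(N^2) discrete Fourier transform; a stand-in for an FFT library.
std::vector<cd> dft(const std::vector<cd>& in, bool inverse) {
    const std::size_t n = in.size();
    const double pi = std::acos(-1.0);
    const double sign = inverse ? 1.0 : -1.0;
    std::vector<cd> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cd sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = sign * 2.0 * pi * double(k * t) / double(n);
            sum += in[t] * cd(std::cos(angle), std::sin(angle));
        }
        out[k] = inverse ? sum / double(n) : sum;  // scale the inverse by 1/N
    }
    return out;
}

// Filtering in the frequency domain: two forward transforms,
// a pointwise complex multiplication, and one inverse transform.
std::vector<double> filter_fourier(const std::vector<double>& signal,
                                   const std::vector<double>& filter_padded) {
    const std::size_t n = signal.size();
    std::vector<cd> s(n), f(n);
    for (std::size_t i = 0; i < n; ++i) {
        s[i] = cd(signal[i], 0.0);
        f[i] = cd(filter_padded[i], 0.0);
    }
    const std::vector<cd> S = dft(s, false);  // forward transform of the data
    const std::vector<cd> F = dft(f, false);  // forward transform of the filter
    std::vector<cd> P(n);
    for (std::size_t k = 0; k < n; ++k) P[k] = S[k] * F[k];
    const std::vector<cd> p = dft(P, true);   // inverse transform
    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i) out[i] = p[i].real();
    return out;
}

// Reference: direct circular convolution in the spatial domain.
std::vector<double> filter_direct(const std::vector<double>& signal,
                                  const std::vector<double>& filter_padded) {
    const std::size_t n = signal.size();
    std::vector<double> out(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            out[i] += signal[(i + n - j) % n] * filter_padded[j];
    return out;
}

// Largest absolute difference between the two approaches on test data.
double demo_fourier_difference() {
    const std::vector<double> signal = {1, 4, 2, 8, 5, 7, 3, 6};
    const std::vector<double> filter = {0.25, 0.5, 0.25, 0, 0, 0, 0, 0};
    const std::vector<double> a = filter_fourier(signal, filter);
    const std::vector<double> b = filter_direct(signal, filter);
    double max_diff = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        max_diff = std::max(max_diff, std::fabs(a[i] - b[i]));
    return max_diff;
}
```

The cost of the transform-based path does not depend on the number of non-zero filter coefficients, only on the padded length, which is one reason to plot it next to spatial convolution across filter sizes.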