[Plot: performance in megapixels per second (axis 0 to 12000) versus filter size (axis 2 to 18). Curves: Texture, Texture unrolled, Shared, Shared unrolled, FFT.]
Figure 5.6. Performance, measured in megapixels per second, for the different
implementations of 2D filtering, for an image of size 2048 × 2048 and filter sizes ranging
from 3 × 3 to 17 × 17. The processing time for FFT-based filtering is independent of the
filter size, which makes it the fastest approach for non-separable filters larger than 17 × 17.
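The filter-size independence noted in the caption can be illustrated with a rough arithmetic model. The sketch below is only an assumption-laden estimate, not the measurement behind the figure: the function names are hypothetical, the figure of about 5 floating-point operations per point per log2 step of an FFT is a common textbook approximation, and the FFT path is assumed to use two forward transforms and one inverse transform. Memory behavior, which dominates on a real GPU, is ignored entirely.

```python
import math


def megapixels_per_second(width: int, height: int, seconds: float) -> float:
    """Throughput metric used on the vertical axis: pixels processed per second / 1e6."""
    return width * height / seconds / 1e6


def direct_flops_per_pixel(filter_size: int) -> float:
    """Non-separable 2D spatial convolution: one multiply and one add per filter tap."""
    return 2.0 * filter_size * filter_size


def fft_flops_per_pixel(image_size: int, flops_per_point_per_stage: float = 5.0) -> float:
    """Assumed cost model for FFT-based filtering of an image_size x image_size image.

    Three transforms (two forward, one inverse), each costing roughly
    flops_per_point_per_stage * log2(number of pixels) per pixel, plus one
    complex multiplication per pixel (4 multiplies + 2 adds = 6 flops).
    The filter size does not appear anywhere in this expression.
    """
    stages = math.log2(image_size * image_size)
    return 3.0 * flops_per_point_per_stage * stages + 6.0


if __name__ == "__main__":
    for k in range(3, 18, 2):
        print(k, direct_flops_per_pixel(k), fft_flops_per_pixel(2048))
```

In this model the direct cost grows quadratically with the filter width while the FFT cost stays constant for a given image size. The crossover point it predicts should not be expected to match the measured one, since the constants are guesses.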
5.9.2 Performance, 3D Filtering
Performance estimates for non-separable 3D filtering are given in Figures 5.8-5.9.
Again, time for transferring the data to and from the GPU is not included. The first
plot is for a fixed volume size of 256 × 256 × 256 voxels and filter sizes ranging from
3 × 3 × 3 to 17 × 17 × 17. The second and third plots are for fixed filter sizes of
7 × 7 × 7 and 13 × 13 × 13, respectively. The volume sizes for these plots range from
64 × 64 × 64 to 512 × 512 × 512 in steps of 32 voxels. All plots contain the processing
time for spatial convolution using texture memory (with and without loop unrolling),
spatial convolution using shared memory (with and without loop unrolling), and
FFT-based filtering using the CUFFT library (involving two forward FFTs, complex-valued
multiplication, and one inverse FFT).
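The FFT-based path described above (two forward FFTs, a complex-valued multiplication, and one inverse FFT) can be sketched on the CPU to make the sequence of steps concrete. This is a minimal illustration in plain Python, reduced to 2D for brevity, and not the CUFFT-based implementation that was benchmarked. All function names are hypothetical. It assumes a square image whose side is a power of two, a filter no larger than the image, and circular (wrap-around) boundary handling with the filter anchored at its top-left element.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft2(grid, inverse=False):
    """2D FFT: transform every row, then every column."""
    rows = [fft(row, inverse) for row in grid]
    cols = [fft(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def fft_filter(image, kernel):
    """Circular 2D convolution via two forward FFTs, a pointwise product, one inverse FFT."""
    size = len(image)
    # Zero-pad the filter to the image size so both spectra have the same shape.
    padded = [[0.0] * size for _ in range(size)]
    for y, row in enumerate(kernel):
        for x, weight in enumerate(row):
            padded[y][x] = weight
    image_f = fft2(image)        # forward FFT 1
    kernel_f = fft2(padded)      # forward FFT 2
    product = [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(image_f, kernel_f)]
    result = fft2(product, inverse=True)  # inverse FFT (still unnormalized)
    scale = size * size
    return [[value.real / scale for value in row] for row in result]


def direct_filter(image, kernel):
    """Reference spatial-domain circular convolution, used only to check the FFT path."""
    size = len(image)
    out = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            acc = 0.0
            for ky, row in enumerate(kernel):
                for kx, weight in enumerate(row):
                    acc += weight * image[(y - ky) % size][(x - kx) % size]
            out[y][x] = acc
    return out
```

The work done in `fft_filter` depends on the image size but not on how many filter coefficients are non-zero, whereas the inner loops of `direct_filter` grow with the filter size. Extending the sketch to 3D would add one more pass of 1D transforms along the third axis.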
5.9.3 Performance, 4D Filtering
Performance estimates for non-separable 4D filtering are given in Figures 5.10-5.11.
Again, time for transferring the data to and from the GPU is not included. The first
plot is for a fixed data size of 128 × 128 × 128 × 32 elements and filter sizes ranging
from 3 × 3 × 3 × 3 to 17 × 17 × 17 × 17. The second and third plots