Figure 11.6
Parallel window split strategy with tree partitioning of degree four.
[Diagram omitted: three OS-split nodes, four FFT nodes each labeled 64, and three OS-join nodes.]
The SQFs FFTpart and FFTcombine are FFT-specific window split and join functions, respectively, and RRpart and S-Merge distribute complete windows independently of the SQF being computed.
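To make the idea of distributing complete windows concrete, here is a minimal sequential sketch. It assumes that RRpart deals whole windows round-robin to the branches and that S-Merge restores the original window order by interleaving the branch results; the text does not spell out either behavior, so both are assumptions. The names `rr_part`, `s_merge`, and `window_distribute` are hypothetical stand-ins, not the system's API.

```python
def rr_part(windows, n):
    """Hypothetical stand-in for RRpart: deal complete windows round-robin to n branches."""
    branches = [[] for _ in range(n)]
    for i, window in enumerate(windows):
        branches[i % n].append(window)
    return branches


def s_merge(branches):
    """Hypothetical stand-in for S-Merge: interleave branch results to restore window order."""
    merged = []
    longest = max((len(branch) for branch in branches), default=0)
    for i in range(longest):
        for branch in branches:
            if i < len(branch):
                merged.append(branch[i])
    return merged


def window_distribute(windows, n, sqf):
    """Apply an arbitrary per-window function on n branches (run sequentially here)."""
    branches = rr_part(windows, n)
    results = [[sqf(window) for window in branch] for branch in branches]
    return s_merge(results)


if __name__ == "__main__":
    windows = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
    print(window_distribute(windows, 4, sum))
```

Because each branch receives whole windows, the per-window function passed as `sqf` needs no knowledge of the partitioning, which is what sets this scheme apart from a function-specific split.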
Complex stream partitioning schemes can be defined by combining distribution templates. For example, Figure 11.6 shows a complex tree-shaped distribution scheme containing two levels of window splits and window joins for parallel computation of FFT streams specified by the template:
PCC(2,"OS-Split","FFTpart", "PCC",
{2,"OS-Split","FFTpart","FFT","FFTcombine"},
"OS-Join", "FFTcombine");
Figure 11.7 shows a performance comparison of executing the FFT computation using different distribution templates with parallelism of degree four. The experiments show that both window split and window distribute
Figure 11.7
FFT performance for parallelism of degree four with various distribution templates.
[Plot omitted: x-axis is Logical Window Size (256, 2048, 4096, 8192, 16384); y-axis runs from 0 to 90 with no unit given; series are WS4-Flat, WS4-Tree, WD4-Flat, and WD4-Tree.]