Civil Engineering Reference
In-Depth Information
[Figure: CPU time, speed-up and efficiency plotted against the number of processors, NP]
Figure 7.29 Insertion with optimal zonal divisions.
65% for parallel insertion with a zonal division of 2 × 2 × 2 = 8 zones, as shown in Table 7.9;
as 8 is not divisible by 6, not all the processors were working at full capacity throughout
the insertion process. These tests show that, for parallel insertion in three or more
dimensions, it is best to use the smallest zonal subdivision whose number of zones is a
multiple of the number of processors.
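The effect of the zone count on efficiency can be estimated with a simple load-balance bound. The sketch below is only an illustration under stated assumptions: every zone is taken to carry the same amount of work, each processor handles whole zones one after another, and all overheads are ignored. The function name `balance_efficiency` is hypothetical and does not come from the text.

```python
def balance_efficiency(zones: int, processors: int) -> float:
    """Upper bound on parallel efficiency for equal-work zones.

    Assumption: zones are indivisible, equally expensive, and there is
    no communication or synchronisation overhead.
    """
    # The busiest processor must handle ceil(zones / processors) zones.
    rounds = -(-zones // processors)
    # Ideal elapsed time is zones / processors; actual elapsed time is `rounds`.
    return zones / (processors * rounds)


if __name__ == "__main__":
    # 8 zones on 6 processors: two processors take a second zone while four idle.
    print(f"8 zones, 6 processors:  {balance_efficiency(8, 6):.1%}")
    # 12 zones on 6 processors: every processor takes exactly two zones.
    print(f"12 zones, 6 processors: {balance_efficiency(12, 6):.1%}")
```

Under these assumptions, 8 zones on 6 processors cannot exceed two-thirds efficiency, which is in line with the roughly 65% reported for that case, whereas 12 zones on 6 processors removes this particular limit.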
The 2 × 2 × 2 zonal parallel insertion has been applied to three non-uniform point
distributions on a PC, in which 5 million randomly generated points were superimposed with
another 5 million points generated, respectively, along the diagonals, over the surface of a
sphere and on a spiral curve (within 1% of the data spread), as shown in Figures 7.30-7.32
for the first 2000 points. These point distributions are designed to simulate realistic cases
of adaptive refinement meshing and local concentration in solid and fluid mechanics problems.
The CPU times in seconds for single-processor and multi-processor insertion are shown in Table 7.10.
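A test set of this kind might be generated along the following lines. This is a rough sketch for the spherical case only, not the procedure used for the reported tests: the unit-cube domain, the sphere centre and radius, the reading of "within 1% of the data spread" as a radial jitter of ±0.01, and all function names are assumptions made for illustration.

```python
import math
import random


def uniform_points(n, rng):
    """n points drawn uniformly at random in the unit cube."""
    return [(rng.random(), rng.random(), rng.random()) for _ in range(n)]


def sphere_surface_points(n, rng, centre=(0.5, 0.5, 0.5), radius=0.4, jitter=0.01):
    """n points scattered close to the surface of a sphere.

    Assumption: `jitter` is 1% of the unit data spread, applied radially.
    """
    points = []
    while len(points) < n:
        # A normalised Gaussian vector gives a uniformly distributed direction.
        dx, dy, dz = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(dx * dx + dy * dy + dz * dz)
        if norm == 0.0:
            continue  # degenerate sample, draw again
        r = radius + rng.uniform(-jitter, jitter)
        points.append((centre[0] + r * dx / norm,
                       centre[1] + r * dy / norm,
                       centre[2] + r * dz / norm))
    return points


def mixed_distribution(n_each, seed=0):
    """Uniform background superimposed with points near a sphere surface."""
    rng = random.Random(seed)
    return uniform_points(n_each, rng) + sphere_surface_points(n_each, rng)


if __name__ == "__main__":
    pts = mixed_distribution(1000)
    print(len(pts), "points generated")
```

The same pattern would extend to the diagonal and spiral cases by replacing `sphere_surface_points` with a generator that samples along the chosen curve and perturbs each sample slightly.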
For uniform random points, the speed-up is 4.3 times, which is reduced to approximately
3.5 times for the non-uniform point distributions, showing that the parallel insertion
process is not very sensitive to the point distribution as long as the loading is fairly
balanced. It is interesting to note that the CPU time taken for point insertion of the
spherical surface distribution is slightly less than that of the uniform distribution. This
may be because the average number of points per cell is closer to optimal for the spherical
distribution tested, which is locally similar to a uniform distribution. The CPU times quoted
in Table 7.10 are shorter than those given in Table 7.7 for the insertion of the same number
of points. This is because the non-uniform distribution tests were done using a compiled Fortran program
Table 7.9 Parallel insertion with optimal zonal divisions
NP    Zones        CPU      SU       Eff
1     1            238.8    1.000    100.00
2     2 × 2 × 2    122.4    1.951    97.55
4     2 × 2 × 2    60.96    3.917    97.93
6     2 × 2 × 3    41.32    5.779    96.32
8     2 × 2 × 2    31.42    7.600    95.00
10    2 × 2 × 5    28.43    8.40     84.00
12    2 × 2 × 3    22.08    10.82    90.13
6     2 × 2 × 2    60.95    3.92     65.30
Note: CPU = CPU time in seconds, SU = speed-up, Eff = efficiency (%).
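The speed-up and efficiency columns follow from the CPU times through the usual definitions, SU = T(1)/T(NP) and Eff = 100 × SU/NP. The short sketch below recomputes them from the tabulated CPU times as a consistency check; the function and variable names are illustrative only.

```python
def speed_up(t_serial: float, t_parallel: float) -> float:
    """Ratio of single-processor time to multi-processor time."""
    return t_serial / t_parallel


def efficiency_percent(t_serial: float, t_parallel: float, processors: int) -> float:
    """Speed-up per processor, expressed as a percentage."""
    return 100.0 * speed_up(t_serial, t_parallel) / processors


T1 = 238.8  # single-processor CPU time in seconds (Table 7.9)

# (NP, zones, CPU time in seconds) copied from Table 7.9
ROWS = [
    (2, "2x2x2", 122.4),
    (4, "2x2x2", 60.96),
    (6, "2x2x3", 41.32),
    (8, "2x2x2", 31.42),
    (10, "2x2x5", 28.43),
    (12, "2x2x3", 22.08),
    (6, "2x2x2", 60.95),
]

if __name__ == "__main__":
    for n_proc, zones, cpu in ROWS:
        su = speed_up(T1, cpu)
        eff = efficiency_percent(T1, cpu, n_proc)
        print(f"NP={n_proc:2d}  zones={zones}  SU={su:6.3f}  Eff={eff:6.2f}%")
```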
 