Civil Engineering Reference
In-Depth Information
Table 7.8 Scalability test with 1 to 12 processors
No. of processors   CPU time   Speed-up   Efficiency (%)
 1                  238.8      1.000      100.00
 2                  153        1.561       78.04
 4                  76.43      3.124       78.11
 6                  54.09      4.415       73.58
 8                  39.57      6.035       75.44
10                  32.25      7.405       74.05
12                  27.12      8.805       73.38
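The speed-up and efficiency columns of Table 7.8 can be recomputed from the CPU times alone. The following is a minimal sketch, assuming the usual definitions (speed-up = T(1) / T(p) and efficiency = speed-up / p), which the text does not state explicitly but which reproduce the tabulated values to within rounding. The function and variable names are illustrative.

```python
# CPU times from Table 7.8, keyed by number of processors.
CPU_TIME = {1: 238.8, 2: 153.0, 4: 76.43, 6: 54.09, 8: 39.57, 10: 32.25, 12: 27.12}


def scalability(cpu_time):
    """Return {p: (speed_up, efficiency_percent)} relative to the 1-processor run.

    Assumed definitions: speed-up = T(1) / T(p), efficiency = speed-up / p.
    """
    t1 = cpu_time[1]
    result = {}
    for p, t in sorted(cpu_time.items()):
        speed_up = t1 / t
        result[p] = (speed_up, 100.0 * speed_up / p)
    return result


if __name__ == "__main__":
    for p, (s, e) in scalability(CPU_TIME).items():
        print(f"{p:2d}  {s:6.3f}  {e:6.2f}")
```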
Figure 7.28 shows that the CPU time decreases rapidly as more processors are used
in the parallel insertion, and that the speed-up grows almost linearly with the number of
processors: efficiency falls by only about five percentage points, from 78% to 73%, for 2 to
12 processors. Although good scalability is observed for parallel zonal insertion, its
performance in 3D is sensitive to the number of subdivisions: in higher dimensions the ratio
of boundary to volume increases, and hence so does the amount of work spent on redundant
simplices. This is only a scalability test, in which the zonal
subdivision may not be the most efficient for the number of processors employed. A scalability
test with more than 12 processors has not yet been carried out, as the algorithm would have to
run on a cluster machine with a parallel environment quite different from the OpenMP system
with shared memory.
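The boundary-to-volume argument can be made concrete with a simplified model that is not taken from the text: assume a unit hypercube cut into k equal parts along every axis, so that each zone is a hypercube of side 1/k. Its surface-to-volume ratio is then 2dk, which grows with both the dimension d and the number of subdivisions k. The function name is illustrative.

```python
def boundary_to_volume(dim, k):
    """Surface-to-volume ratio of one cubic zone of side 1/k in `dim` dimensions.

    Illustrative model only: a hypercube of side s has 2*dim faces, each of
    measure s**(dim - 1), and volume s**dim, giving a ratio of 2*dim/s.
    """
    side = 1.0 / k
    surface = 2 * dim * side ** (dim - 1)
    volume = side ** dim
    return surface / volume


if __name__ == "__main__":
    for dim in (2, 3):
        for k in (2, 4):
            print(f"dim={dim} k={k} ratio={boundary_to_volume(dim, k):.1f}")
```

Under this model the ratio for a given k is always larger in 3D than in 2D, which is consistent with the sensitivity to the number of subdivisions described above.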
Finally, processors can run at full capacity only when the number of zones is an integral
multiple of the number of processors used and the number of zones is kept to a minimum.
This trivial fact was verified in the insertion of 20 million points with a zonal division of 2 ×
2 × 2 = 8 zones for 2, 4 and 8 processors, 2 × 2 × 3 = 12 zones for 6 and 12 processors and
2 × 2 × 5 = 20 zones for 10 processors. The results are shown in Table 7.9, and a plot of CPU
time, speed-up and efficiency is depicted in Figure 7.29. Parallel insertion
achieved an efficiency of 90% or more with 2, 4, 6, 8 or 12 processors, whereas
the efficiency dropped to 84% with 10 processors. The slightly lower efficiency
with 10 processors is ascribed to the more complicated zonal subdivision
of 2 × 2 × 5 = 20 zones, whereas only 2 × 2 × 2 = 8 or 2 × 2 × 3 = 12 zones were used in the
other cases. The efficiency of using six processors dropped drastically from 96% to only
[Figure 7.28 Insertion of 20 million points: CPU time, speed-up and efficiency plotted against the number of processors, NP.]
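The remark that processors run at full capacity only when the number of zones is an integral multiple of the number of processors can be illustrated with an idealised model. The sketch below assumes that every zone costs the same and that each processor handles one zone at a time, neither of which the text claims. Under that assumption, Z zones on P processors take ceil(Z / P) rounds, and the fraction of processor time spent busy is Z / (P × rounds). The function name is hypothetical.

```python
def ideal_utilisation(zones, processors):
    """Fraction of processor time kept busy, assuming equal-cost zones.

    Idealised model: zones are processed in rounds of up to `processors`
    zones each, so the last round may leave some processors idle.
    """
    rounds = (zones + processors - 1) // processors  # integer ceiling
    return zones / (processors * rounds)


if __name__ == "__main__":
    # Zone counts quoted in the text for each processor count.
    zones_for = {2: 8, 4: 8, 6: 12, 8: 8, 10: 20, 12: 12}
    for p, z in sorted(zones_for.items()):
        print(f"{p:2d} processors, {z:2d} zones: {ideal_utilisation(z, p):.2f}")
    # A hypothetical mismatch: 8 zones are not a multiple of 6 processors.
    print(f" 6 processors,  8 zones: {ideal_utilisation(8, 6):.2f}")
```

All zone counts quoted in the text give full utilisation in this model, so the measured drop to 84% with 10 processors would have to come from other costs, such as the more complicated subdivision mentioned above.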
 