7.1 Loop Partitioning
Figure 7 shows the behavior of the three parallel models (LPMC, LPMP and
LPMC+LPMP). The x axis shows the variation of either np or nt; note that
only the hybrid model shows the variation of np with nt = 2. The y axis shows
the variation of the speedup S as the number of Processing Units (np or nt) increases.
[Figure 7: speedup S (y axis, 1 to 8) versus Processing Units (nt & np) (x axis, 1 to 4). Curves: Ideal 1, Ideal 2, LPMC, LPMP, LPMC+LPMP (nt=2).]
Fig. 7. Performance comparison among parallel models as np (nt) increases
For this implementation, according to Amdahl's law, a near-linear speedup is
expected in every model, since the parallel section covers more than 91.3% of
the code. For example, the LPMP model stays close to the ideal behavior for
1 ≤ np < 4. Nevertheless, the LPMC model departs from linearity when nt > 2
(see Fig. 7). This is caused by the performance of the logical cores
(footnote 17) once the available physical cores are oversubscribed. Only when
nt = 2 does the speedup of the LPMC model approach that of the LPMP model, and
even then it is slightly lower: the LPMP model reached approximately
S = 1.97X, whereas the LPMC model reached S = 1.90X (see Table 1), using two
computer nodes and two cores respectively. The relative error between the
ideal speedup and that of the LPMC model increases as the number of cores
increases, and consequently the parallel efficiency decreases. The hybrid
model (LPMP + LPMC) achieves the highest performance, since every core
contributes to the total efficiency of the processing.
17 Hyper-threading technology on Xeon processors.
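The quantities discussed above (the Amdahl bound, the parallel efficiency, and the relative error against the ideal speedup) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the parallel fraction p = 0.913 is the lower bound quoted in the text, and efficiency is assumed to be the usual S divided by the number of processing units. The only measured values used are the two speedups reported for two processing units.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Amdahl's law: speedup bound for parallel fraction p on n processing units."""
    return 1.0 / ((1.0 - p) + p / n)


def parallel_efficiency(speedup: float, n: int) -> float:
    """Efficiency E = S / n (assumed definition; 1.0 means ideal linear scaling)."""
    return speedup / n


def relative_error_vs_ideal(speedup: float, n: int) -> float:
    """Relative gap between a measured speedup and the ideal speedup n."""
    return (n - speedup) / n


# Lower bound on the parallel fraction, as stated in the text ("more than 91.3%").
P_LOWER_BOUND = 0.913

# Speedups reported in the text for two processing units.
MEASURED_AT_TWO_UNITS = {"LPMP": 1.97, "LPMC": 1.90}

if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        bound = amdahl_speedup(P_LOWER_BOUND, n)
        print(f"n={n}: Amdahl bound with p={P_LOWER_BOUND} is {bound:.2f}X")

    for model, s in MEASURED_AT_TWO_UNITS.items():
        eff = parallel_efficiency(s, 2)
        err = relative_error_vs_ideal(s, 2)
        print(f"{model}: S={s:.2f}X, efficiency={eff:.3f}, relative error={err:.3f}")
```

With p fixed at exactly 0.913, the bound for two units is about 1.84X, which is below the reported 1.97X; this is consistent with the text's statement that the parallel section covers more than 91.3% of the code. For two units the reported speedups correspond to efficiencies of 0.985 (LPMP) and 0.95 (LPMC).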