22        if (master) {
23          times.append(t*rate/NITERS);
24        }
25      }
26      level.g, level.L = leastSquares(times)
27      return (level.g, level.L)
28    }
Algorithm 1.2. coreBenchmark function.
Then a synchronization for the current level is performed (line 5) to ensure
that all threads have the computing rate value.
The coreBenchmark function measures a full h-communication, which we define
as the extension of an h-relation to the shared-memory case within a single
node. It is implemented as a communication in which every core writes/reads
exactly h data words. We consider the worst case, measuring the slowest
possible communication by cyclically reading single data words into other
processors. As a result, the values of g_i and L_i computed by the benchmark
are pessimistic, and the real values will always be better. The variable h
represents the largest number of words read or written in the shared memory of
the level. HMAX is the maximum value of h used in the communication patterns
for each level. It may need to differ between levels of the hierarchy; we plan
to find suitable values by trial and error.
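The cyclic worst-case pattern can be illustrated with a short sketch. This is only one plausible reading of "cyclically reading single data words into other processors", in the style of classic BSP benchmarks; the function names cyclic_targets and incoming_counts are hypothetical and do not come from the tool described here.

```python
def cyclic_targets(core, ncores, h):
    """Return the core that each of `core`'s h single-word reads is taken from."""
    if ncores < 2:
        raise ValueError("need at least two cores to communicate")
    # Offsets 1..ncores-1 are visited in turn, so a core never targets itself.
    return [(core + 1 + i % (ncores - 1)) % ncores for i in range(h)]


def incoming_counts(ncores, h):
    """Count how many words are read from each core, summed over all cores."""
    counts = [0] * ncores
    for core in range(ncores):
        for target in cyclic_targets(core, ncores, h):
            counts[target] += 1
    return counts


# Core 0 of 4 issues 5 single-word reads, cycling over cores 1, 2, 3.
targets = cyclic_targets(0, 4, 5)
# With h a multiple of ncores - 1, every core is also read from exactly h times.
counts = incoming_counts(4, 6)
```

Under this assumed pattern, each core issues exactly h reads, and when h is a multiple of the number of other cores, each core is also the source of exactly h words, which matches the notion of a full h-communication.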
The communication times using the h-communication pattern are initialized
by the initCommunicationPattern routine (line 7). This process is repeated
NITERS times (lines 10-13), because a single operation is too fast to be
measured precisely. After that, the master thread in each level saves the flops
used for each h-communication (line 16), that is, the elapsed time averaged
over the NITERS repetitions and converted to flop units with the computing rate.
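The repeat-and-average step can be sketched as follows. This is a single-threaded illustration under stated assumptions, not the tool's implementation: the names measure_h_communications, communicate, and clock are hypothetical, and the range of h (here 1 to HMAX) is assumed because the corresponding algorithm lines are not shown. Only the conversion t*rate/NITERS is taken from the algorithm listing.

```python
import time


def measure_h_communications(communicate, rate, hmax, niters,
                             clock=time.perf_counter):
    """Time each h-communication and return the costs in flop units.

    communicate(h) is a hypothetical callable that performs one
    h-communication; rate is the computing rate in flops per second.
    """
    times = []
    for h in range(1, hmax + 1):  # assumed range of h
        start = clock()
        # One operation is too fast to time reliably, so repeat it.
        for _ in range(niters):
            communicate(h)
        t = clock() - start
        # Average over the repetitions and convert seconds to flops.
        times.append(t * rate / niters)
    return times
```

Passing the clock as a parameter is a design choice made here so the routine can be exercised with a deterministic clock; in a real measurement the default high-resolution timer would be used.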
Finally, the parameters g and L are computed using a traditional least squares
approximation method (line 19), fitting the data to a linear model as in the
related works [1,6], which provides an accurate approximation of g_i and L_i.
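As an illustration of the fitting step, the sketch below fits the linear model T(h) = g*h + L by ordinary least squares. It is a minimal example under assumptions: the function name least_squares is hypothetical, and the h values are passed explicitly, whereas the algorithm listing passes only the measured times. The sample data are synthetic and chosen only to make the fit easy to check.

```python
def least_squares(hs, times):
    """Fit times ~ g*h + L by ordinary least squares and return (g, L)."""
    n = len(hs)
    if n < 2 or n != len(times):
        raise ValueError("need at least two (h, time) pairs of equal length")
    mean_h = sum(hs) / n
    mean_t = sum(times) / n
    sxx = sum((h - mean_h) ** 2 for h in hs)
    if sxx == 0:
        raise ValueError("all h values are equal; the slope is undefined")
    sxy = sum((h - mean_h) * (t - mean_t) for h, t in zip(hs, times))
    g = sxy / sxx              # slope: cost per data word
    L = mean_t - g * mean_h    # intercept: fixed synchronization cost
    return g, L


# Synthetic, noise-free data generated from g = 3 and L = 50 (flop units).
hs = list(range(1, 9))
times = [3.0 * h + 50.0 for h in hs]
g, L = least_squares(hs, times)
```

With noise-free input the fit recovers the generating parameters; with real measurements the slope and intercept are the estimates of g_i and L_i for the level being benchmarked.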
3.3 Methodology for the Empirical Evaluation of h-Communications
The methodology applied to measure the h-communications and then estimate
the parameters g and L is based on measuring the implementation of MultiBSP
operations. By MultiBSP operations we mean the functions/procedures needed to
implement an algorithm designed with the MultiBSP computational model. In
our software design, the MBSP operations module contains the implementation
of these functions, including operations provided by the MulticoreBSP for C
library [9]. This library establishes a methodology for programming according
to the MultiBSP computational model.
The software design shown in Fig. 3 is important here: when MultiBSP
algorithms are programmed using other libraries, the tool can be reconfigured
by changing the MBSP operations module, and the architecture can then be
re-characterized by running the benchmark with the new configuration.
 