network is used with a switch in the centre node, which is the actual setup
we used for our empirical evaluation in Sect. 5 . In this case a multicast can be
used which transmits the training data from the original node only once to the
switch, which then multiplies the data and distributes them to each computing
node on a separate wire. Hence, in this case, we can ignore the multiplication of
$T_{comdat}$ with $p$, as in this case $p = 1$.
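To make this one-to-many distribution concrete, the following is a minimal, self-contained Java sketch of multicast delivery in general, not the authors' actual Hadoop data path: the sender transmits the training data once, and the switch replicates the datagram to every subscribed computing node. The group address, port, and payload are hypothetical, and a real deployment would split the data into datagram-sized chunks.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

public class TrainingDataMulticast {
    public static void main(String[] args) throws Exception {
        // Hypothetical administratively scoped multicast group and port.
        InetAddress group = InetAddress.getByName("230.0.0.1");
        int port = 4446;
        byte[] payload = "training-data-chunk".getBytes();

        // The data leaves the original node exactly once; the switch
        // replicates the datagram to each node that joined the group.
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(payload, payload.length, group, port));
        }
    }
}
```

Receivers would join the same group via java.net.MulticastSocket, so the sender's cost is independent of the number of computing nodes.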
$T_{asm,i}$ remains a computational bottleneck, which increases with the number of base classifiers. However, its computational requirements are relatively low even for large numbers of base classifiers, and it is not expected to have a large impact on $T_{total}(p)$. Nevertheless, it would be possible to parallelise $T_{asm,i}$, at least to a certain extent, by using multiple
Reducers executed on different cluster nodes. One Reducer per two Mappers
could combine the rule sets of these two Mappers. The Reducers' outputs (again rule sets) could then be combined similarly using further Reducers executed on different cluster nodes. This may be beneficial for very large numbers of base classifiers; the sketch below illustrates the pairwise combination.
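The following is a minimal sketch of this idea, assuming that assembling the ensemble is a simple union of rule sets (representing a rule set as a list of strings is a hypothetical stand-in). Each combine call on a given level could run as a separate Reducer on its own cluster node, so $b$ Mapper outputs are merged in roughly $\log_2 b$ parallel rounds instead of $b$ sequential steps.

```java
import java.util.ArrayList;
import java.util.List;

public class RuleSetTreeReducer {
    // Assumed: assembling the final classifier is a simple union of rule sets.
    static List<String> combine(List<String> a, List<String> b) {
        List<String> merged = new ArrayList<>(a);
        merged.addAll(b);
        return merged;
    }

    // Pairwise tree reduction: one Reducer per two Mapper outputs,
    // repeated level by level until a single rule set remains.
    static List<String> treeReduce(List<List<String>> mapperOutputs) {
        List<List<String>> level = mapperOutputs; // precondition: non-empty
        while (level.size() > 1) {
            List<List<String>> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                if (i + 1 < level.size()) {
                    // Each combine could run on a different cluster node.
                    next.add(combine(level.get(i), level.get(i + 1)));
                } else {
                    next.add(level.get(i)); // odd one out moves up a level
                }
            }
            level = next;
        }
        return level.get(0);
    }
}
```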
The speed-up factor is a standard metric to evaluate the scalability of parallel algorithms with respect to the number of computing nodes or processors $p$ being used [13, 15]. It shows how much faster a parallel version of an algorithm is compared with its single-processor version. The generic formula for the speed-up $S(p)$ is:

$$S(p) = \frac{\text{runtime } T \text{ on 1 processor}}{\text{runtime } T \text{ on } p \text{ processors}}$$
For Parallel Random Prism the numerator of $S(p)$ can be substituted by $T_{total}(1)$ and the denominator of $S(p)$ can be substituted by $T_{total}(p)$. Thus the speed-up for Parallel Random Prism can be described by:
$$S(p) = \frac{T_{total}(1)}{T_{total}(p)} = \frac{T_{comdat} + \sum_{i=1}^{b} T_{sam,i} + \sum_{i=1}^{b} T_{cla,i} + \sum_{i=1}^{b} T_{comres,i} + \sum_{i=1}^{b} T_{asm,i}}{T_{comdat} \cdot p + \frac{\sum_{i=1}^{b} T_{sam,i}}{p} + \frac{\sum_{i=1}^{b} T_{cla,i}}{p} + \frac{\sum_{i=1}^{b} T_{comres,i}}{p} + \sum_{i=1}^{b} T_{asm,i}}$$
Again, what can be seen is that the only limiting factors are $T_{comdat} \cdot p$ and $\sum_{i=1}^{b} T_{asm,i}$ in the denominator of $S(p)$, as they are not parallelised. Yet, the time needed to execute $T_{asm,i}$ and $T_{comdat} \cdot p$ is negligibly small compared with the parallelised portions of Parallel Random Prism. Thus we can assume that the speed-up $S(p)$ will be close to the ideal case, which is $S(p) = p$. For example, if running Parallel Random Prism consumes 10000 ms on one node, then for using 4 nodes we would expect the runtime to be 2500 ms (4 times faster, assuming the ideal case), hence $S(4) = \frac{10000\,\mathrm{ms}}{2500\,\mathrm{ms}} = 4$.
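To make the cost model concrete, the sketch below evaluates $T_{total}(p)$ and $S(p)$ for illustrative per-task runtimes. The numeric values are entirely hypothetical, not measured; the names tComdat, tSam, tCla, tComres and tAsm mirror the terms of the formula above, with $T_{comdat}$ scaled by $p$ (the non-multicast case) and the assembly term left unparallelised.

```java
public class SpeedupModel {
    // Illustrative per-task runtimes in ms (hypothetical, not measured).
    static double tComdat = 200, tSam = 50, tCla = 400, tComres = 20, tAsm = 5;

    // T_total(p) following the formula above: communication scales with p,
    // sampling/classification/result-communication are divided by p,
    // and the assembly term is not parallelised.
    static double tTotal(int p, int b) {
        return tComdat * p + b * (tSam + tCla + tComres) / p + b * tAsm;
    }

    static double speedup(int p, int b) {
        return tTotal(1, b) / tTotal(p, b);
    }

    public static void main(String[] args) {
        int b = 100; // number of base classifiers
        for (int p : new int[] {1, 2, 4, 8, 16}) {
            System.out.printf("p=%2d  S(p)=%.2f%n", p, speedup(p, b));
        }
    }
}
```

Under these made-up figures the speed-up stays close to ideal for small $p$ but flattens as $T_{comdat} \cdot p$ grows, which is exactly the limiting behaviour described above.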
The formula for $S(p)$ above could also be used to determine the theoretical maximum speed-up, by building the derivative $S'(p)$, calculating its x-axis intercepts and then subsequently determining the global maxima of $S(p)$. However, we refrain from this step here.
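As a numeric stand-in for that analytic step, the fragment below simply scans $p$ for the maximum of $S(p)$, reusing the hypothetical SpeedupModel sketch from above (assumed to be in the same package); with those made-up cost figures the maximum lies at an interior value of $p$, illustrating why a maximum exists at all.

```java
public class MaxSpeedupScan {
    public static void main(String[] args) {
        int b = 100; // base classifiers, as in the previous sketch
        int bestP = 1;
        double bestS = 0;
        for (int p = 1; p <= 1024; p++) {
            double s = SpeedupModel.speedup(p, b); // reuses the cost model above
            if (s > bestS) { bestS = s; bestP = p; }
        }
        System.out.printf("theoretical max S(p) = %.2f at p = %d%n", bestS, bestP);
    }
}
```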
Next, Sect. 5 will provide an empirical analysis of Parallel Random Prism, supporting the theoretical analysis presented in this section.