Figure 9 shows results for an initial task split of 72 tasks, or three per worker.
Again, MARLA splits tasks into 8 subtasks at each node. Figure 9 shows that
some upgrades result in performance degradation. In particular, configurations
<(52, 1.0), (32, 1.075), (16, 8.010)> and <(20, 1.0), (64, 1.075), (16, 8.010)>
underperform surrounding data points. In this case, Faster nodes request
additional work that they cannot complete in time to improve turnaround, because
their requests arrive after the new Fastest nodes have started executing additional
tasks. The new tasks on Faster nodes then increase the turnaround time as the
framework waits for them to finish. In other configurations, the Fastest nodes can
complete these tasks because they constitute a higher percentage of the cluster
and are able to get to these tasks before the Faster nodes can.
Comparing Figs. 8 and 9 shows that a split granularity of 72 tasks instead
of 24 enables MARLA to adapt to cluster upgrades more efficiently. The difference
in performance between these two figures illustrates that with a finer task
granularity, upgrades to fewer nodes can still lead to faster execution times.
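The granularity effect can be illustrated with a toy model. The sketch below is not MARLA's scheduler: it assumes idle nodes pull equal-sized tasks from a shared queue, ignores subtasks and communication overhead, and uses hypothetical node speeds (20 baseline nodes and 4 nodes that are four times as fast) and a hypothetical total workload. The function name `makespan` and its parameters are illustrative only.

```python
import heapq

def makespan(speeds, num_tasks, total_work=72.0):
    """Toy pull-based scheduler: each idle node takes the next equal-sized task."""
    work_per_task = total_work / num_tasks
    # Heap of (time at which the node becomes idle, node index).
    idle_at = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(idle_at)
    finish = 0.0
    for _ in range(num_tasks):
        t, i = heapq.heappop(idle_at)          # earliest idle node pulls a task
        done = t + work_per_task / speeds[i]   # faster nodes finish sooner
        finish = max(finish, done)
        heapq.heappush(idle_at, (done, i))
    return finish

# Hypothetical 24-node cluster: 20 baseline nodes, 4 upgraded nodes (4x speed).
speeds = [1.0] * 20 + [4.0] * 4

coarse = makespan(speeds, 24)  # one task per node
fine = makespan(speeds, 72)    # three tasks per node

print(coarse, fine)  # 3.0 2.0
```

With one task per node, the upgraded nodes finish early and sit idle, so the job still takes as long as a baseline node needs for its single task. With three tasks per node, the upgraded nodes pull extra tasks while the baseline nodes are busy, which shortens the overall turnaround in this model.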
Fig. 9. This contour plot shows the effects of varying two kinds of nodes within a
cluster with respect to computation time. This case uses 72 tasks in a 24-node
cluster, with each task divided into 8 subtasks. The X-axis shows the percentage
of the cluster that has been upgraded to Faster nodes, while the Y-axis shows the
percentage of the cluster that has been upgraded to Fastest nodes. Impossible points
have been interpolated. The solid lines indicate the trends in the data.
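As a rough illustration of why some points on such a plot are impossible, the sketch below enumerates the axis combinations that a 24-node cluster can actually realize. It assumes that a point is impossible when the Faster and Fastest percentages together exceed 100%; the caption does not define the term, so this reading and the helper name `feasible_points` are assumptions.

```python
def feasible_points(num_nodes=24):
    """Return (faster_pct, fastest_pct) pairs whose node counts fit in the cluster."""
    points = []
    for faster in range(num_nodes + 1):
        # Fastest nodes can only occupy what the Faster nodes leave free.
        for fastest in range(num_nodes + 1 - faster):
            points.append((100.0 * faster / num_nodes,
                           100.0 * fastest / num_nodes))
    return points

points = feasible_points()
grid_size = (24 + 1) ** 2  # every (Faster count, Fastest count) pair on a full grid

print(len(points), grid_size - len(points))  # 325 300
```

Under this assumption, only the triangle below the diagonal of the plot corresponds to real clusters, and the remaining grid points would have to be filled in by interpolation.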
We also consider configurations where MARLA divides tasks into 32 subtasks.
Figure 10 indicates that when too few tasks exist, Baseline nodes incur
the overhead of 32 subtasks on a 4-core machine. This effect appears in the
time difference at configurations <(19, 1.0), (65, 1.075), (16, 8.010)> and
<(3, 1.0), (65, 1.075), (32, 8.010)> relative to the corresponding points in Fig. 8.
The best performance is achieved in a larger range of configurations when each
node processes 8 subtasks instead of 32. Therefore, the one-task per worker
 