Database Reference
In-Depth Information
2.5 The Communication Cost Model
In this section we shall introduce a model for measuring the quality of algorithms imple-
mented on a computing cluster of the type so far discussed in this chapter. We assume the
computation is described by an acyclic workflow, as discussed in Section 2.4.1 . For many
applications, the bottleneck is moving data among tasks, such as transporting the outputs
of Map tasks to their proper Reduce tasks. As an example, we explore the computation of
multiway joins as single MapReduce jobs, and we see that in some situations, this approach
is more efficient than the straightforward cascade of 2-way joins.
2.5.1
Communication-Cost for Task Networks
Imagine that an algorithm is implemented by an acyclic network of tasks. These tasks could
be Map tasks feeding Reduce tasks, as in a standard MapReduce algorithm, or they could
be several MapReduce jobs cascaded, or a more general workflow structure, such as a
collection of tasks each of which implements the workflow of Fig. 2.6 . 7 The communica-
tion cost of a task is the size of the input to the task. This size can be measured in bytes.
However, since we shall be using relational database operations as examples, we shall often
use the number of tuples as a measure of size.
The communication cost of an algorithm is the sum of the communication cost of all the
tasks implementing that algorithm. We shall focus on the communication cost as the way
to measure the efficiency of an algorithm. In particular, we do not consider the amount of
time it takes each task to execute when estimating the running time of an algorithm. While
there are exceptions, where execution time of tasks dominates, these exceptions are rare in
practice. We can explain and justify the importance of communication cost as follows.
• The algorithm executed by each task tends to be very simple, often linear in the
size of its input.
• The typical interconnect speed for a computing cluster is one gigabit per second.
That may seem like a lot, but it is slow compared with the speed at which a pro-
cessor executes instructions. Moreover, in many cluster architectures, there is com-
petition for the interconnect when several compute nodes need to communicate at
the same time. As a result, the compute node can do a lot of work on a received
input element in the time it takes to deliver that element.
• Even if a task executes at a compute node that has a copy of the chunk(s) on which
the task operates, that chunk normally will be stored on disk, and the time taken
to move the data into main memory may exceed the time needed to operate on the
data once it is available in memory.
Search WWH ::




Custom Search