reused. To ensure this, we introduce a modification to the scheduler used by Hadoop to incorporate a notion of affinity. The new scheduler takes into account affinities
between machines and tasks by keeping a record of which nodes have executed
which tasks. This allows for scheduling tasks in a way that decreases the movement
of memoized intermediate results, but at the cost of a potential degradation of job
performance due to stragglers [38]. This is because a strict affinity of tasks results
in deterministic scheduling, which prevents a lightly loaded node from performing
work when the predetermined node is heavily loaded. Our scheduler therefore needs
to strike a balance between work stealing and affinity of memoized results. Section
4.5 describes our modified scheduler.
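To make this trade-off concrete, the sketch below shows one way an affinity-aware placement decision could be structured: a task goes to the node that holds its memoized results unless that node is noticeably busier than the least-loaded node, in which case the work is stolen. The class, method names, and the LOAD_SLACK threshold are hypothetical illustrations; this is not the scheduler of Section 4.5 and does not use Hadoop's actual scheduler API.

```java
// Illustrative affinity-aware placement policy (hypothetical names, not Hadoop's API).
import java.util.HashMap;
import java.util.Map;

class AffinityScheduler {
    // Records which node last executed (and therefore holds memoized results for) each task.
    private final Map<String, String> taskAffinity = new HashMap<>();
    // Number of tasks currently assigned to each node, used as a simple load estimate.
    private final Map<String, Integer> nodeLoad = new HashMap<>();
    // Extra load tolerated on the affinity node before falling back to work stealing.
    private static final int LOAD_SLACK = 2;

    // leastLoadedNode is the node a plain work-stealing policy would pick.
    String placeTask(String taskId, String leastLoadedNode) {
        String preferred = taskAffinity.get(taskId);
        if (preferred != null) {
            int preferredLoad = nodeLoad.getOrDefault(preferred, 0);
            int bestLoad = nodeLoad.getOrDefault(leastLoadedNode, 0);
            // Honor affinity only if the preferred node is not much busier than
            // the least-loaded node; otherwise steal the work.
            if (preferredLoad - bestLoad <= LOAD_SLACK) {
                return schedule(taskId, preferred);
            }
        }
        return schedule(taskId, leastLoadedNode);
    }

    private String schedule(String taskId, String node) {
        taskAffinity.put(taskId, node);   // future runs prefer this node
        nodeLoad.merge(node, 1, Integer::sum);
        return node;
    }
}
```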
4.3 INCREMENTAL HDFS
In this section, we present Incremental HDFS (Inc-HDFS), a distributed file system
that enables stable incremental computations in Incoop, while keeping the interface
provided by HDFS. Inc-HDFS builds on HDFS, but modifies the way that files are
partitioned into chunks to use content-based chunking, a technique that was intro-
duced in LBFS [31] for data deduplication. At a high level, content-based chunking
defines chunk boundaries based on finding certain patterns in the input, instead of
using fixed-size chunks. As such, insertions and deletions cause small changes to the
set of chunks. In the context of MapReduce, this ensures that the input to map tasks
remains mostly unchanged, which translates into a stable recomputation. Figure 4.2
illustrates the differences in the strategies for determining chunk boundaries in
HDFS and Inc-HDFS. To perform content-based chunking, we scan the entire file,
examining the contents of a fixed-width window whose initial position is incre-
mented one byte at a time. For each window, we compute its Rabin fingerprint, and if
the fingerprint matches a certain pattern (called a marker), we place a chunk bound-
ary at that position. In addition, this approach can be extended to avoid creating
chunks that are too small or too large, which could affect the overheads and load
balancing properties of MapReduce. (Note that the system designer can only tune
the likelihood of finding a marker; the actual spacing depends on the input.) This
is achieved by setting minimum and maximum chunk sizes: after we find a marker
m_i at position p_i, we skip a fixed offset O and continue to scan the input after position
p_i + O. In addition, we bound the chunk length by setting a marker after M content
bytes even if no marker is found. This can affect stability in rare cases, for example
when skipping the offset leads to two consecutive runs skipping disjoint sets of mark-
ers, but we found this to be a very limited problem in practice.
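The chunking loop described above can be sketched as follows. The sketch uses a simple hash as a stand-in for the Rabin fingerprint, and the window size, marker pattern, and the values chosen for the offset O (MIN_CHUNK) and the bound M (MAX_CHUNK) are illustrative, not the parameters used by Inc-HDFS.

```java
// Sketch of content-based chunking with minimum and maximum chunk sizes.
import java.util.ArrayList;
import java.util.List;

class ContentChunker {
    static final int WINDOW = 48;                    // fixed-width fingerprint window (bytes)
    static final long MARKER_MASK = (1L << 22) - 1;  // marker: low 22 bits of the fingerprint are zero
    static final int MIN_CHUNK = 2 * 1024 * 1024;    // offset O skipped after each boundary
    static final int MAX_CHUNK = 8 * 1024 * 1024;    // forced boundary after M content bytes

    // Returns the positions at which chunk boundaries are placed.
    static List<Integer> chunkBoundaries(byte[] data) {
        List<Integer> boundaries = new ArrayList<>();
        int chunkStart = 0;
        // No marker can be placed before chunkStart + MIN_CHUNK (the skipped offset).
        int pos = Math.min(chunkStart + MIN_CHUNK, data.length);
        while (pos < data.length) {
            long fp = fingerprint(data, pos);
            boolean atMax = pos - chunkStart >= MAX_CHUNK;
            if ((fp & MARKER_MASK) == 0 || atMax) {
                boundaries.add(pos);                  // place a chunk boundary here
                chunkStart = pos;
                pos = Math.min(chunkStart + MIN_CHUNK, data.length);
            } else {
                pos++;                                // slide the window by one byte
            }
        }
        boundaries.add(data.length);                  // last chunk ends at end of file
        return boundaries;
    }

    // Hash of the window ending at 'end' (exclusive); a stand-in for the Rabin
    // fingerprint, recomputed per position here for brevity.
    static long fingerprint(byte[] data, int end) {
        long h = 0;
        for (int i = Math.max(0, end - WINDOW); i < end; i++) {
            h = h * 31 + (data[i] & 0xff);
        }
        return h;
    }
}
```

Recomputing the hash over the entire window at each position keeps the sketch short; an actual implementation would update a rolling Rabin fingerprint incrementally as the window advances by one byte.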
An important design decision is whether to perform chunking during the creation
of the input or when the input is read by the map task. We chose the former because
the cost of chunking can be amortized when chunking and producing the input data
are done in parallel. This is relevant in cases where the generation of input data is not
limited by the storage throughput.
To parallelize the chunking process on multicore machines, our implementation
uses multiple threads, each of which starts the search for the marker at a different
position. The markers that each thread finds cannot be used immediately to define the
chunk boundaries, since some of them might have to be skipped due to the minimum chunk size constraint.
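A minimal sketch of this two-phase scheme is given below, reusing the fingerprint and size constants from the ContentChunker sketch above. The structure, in which worker threads collect candidate markers over disjoint regions and a sequential pass then applies the minimum and maximum chunk-size rules, is assumed here for illustration and is not the Inc-HDFS implementation.

```java
// Sketch of parallel marker search followed by a sequential boundary-selection pass.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelChunker {
    static List<Integer> chunkBoundaries(byte[] data, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<List<Integer>>> parts = new ArrayList<>();
        int stride = (data.length + threads - 1) / threads;

        // Phase 1: each thread records every candidate marker in its own region.
        for (int t = 0; t < threads; t++) {
            final int begin = t * stride;
            final int end = Math.min(begin + stride, data.length);
            parts.add(pool.submit(() -> {
                List<Integer> found = new ArrayList<>();
                for (int pos = begin; pos < end; pos++) {
                    long fp = ContentChunker.fingerprint(data, pos);
                    if ((fp & ContentChunker.MARKER_MASK) == 0) {
                        found.add(pos);
                    }
                }
                return found;
            }));
        }
        List<Integer> candidates = new ArrayList<>();
        for (Future<List<Integer>> part : parts) {
            candidates.addAll(part.get());
        }
        pool.shutdown();
        Collections.sort(candidates);

        // Phase 2: a sequential pass drops candidates that fall inside the skipped
        // offset after the previous boundary and forces a boundary whenever the
        // maximum chunk size is exceeded.
        List<Integer> boundaries = new ArrayList<>();
        int chunkStart = 0;
        for (int pos : candidates) {
            if (pos - chunkStart < ContentChunker.MIN_CHUNK) {
                continue; // candidate lies inside the skipped offset
            }
            while (pos - chunkStart > ContentChunker.MAX_CHUNK) {
                chunkStart += ContentChunker.MAX_CHUNK; // forced split
                boundaries.add(chunkStart);
            }
            boundaries.add(pos);
            chunkStart = pos;
        }
        while (data.length - chunkStart > ContentChunker.MAX_CHUNK) {
            chunkStart += ContentChunker.MAX_CHUNK;     // forced split in the final stretch
            boundaries.add(chunkStart);
        }
        boundaries.add(data.length);
        return boundaries;
    }
}
```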