by implementing an affinity-based scheduler that applies a work-stealing algorithm to minimize the amount of data movement across machines. This modified scheduler strikes a balance between exploiting the locality of previously computed results and executing tasks on any available machine to prevent straggling effects. At runtime, instances of incremental map tasks take advantage of previously stored results by querying the memoization server. If they find that the result has already been computed, they fetch it from the location of the memoized output and conclude. Similarly, the result of a reduce task is remembered by storing it persistently and locally, and a mapping from a collision-resistant hash of the task's input to the location of its output is inserted into the memoization server.
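To make this concrete, the following is a minimal Python sketch of the memoization step, assuming a simple key-value memoization server and a local persistent store; memo_server, local_store, and store_persistently are illustrative names, not the system's actual API.

```python
import hashlib

# Illustrative stand-ins (not the real API): a memoization server mapping a
# collision-resistant hash of a task's input to the location of its output,
# and a local dictionary standing in for persistent storage.
memo_server = {}   # input hash -> output location
local_store = {}   # output location -> stored result

def input_hash(data: bytes) -> str:
    """Collision-resistant hash (SHA-256) of a task's input."""
    return hashlib.sha256(data).hexdigest()

def store_persistently(output) -> str:
    """Store a result locally and return its (hypothetical) location."""
    location = f"loc-{len(local_store)}"
    local_store[location] = output
    return location

def run_map_task(split: bytes, compute):
    """Run an incremental map task, reusing a memoized result if one exists."""
    key = input_hash(split)
    if key in memo_server:
        # Already computed in a previous run: fetch from the memoized
        # location and conclude without re-executing the task.
        return local_store[memo_server[key]]
    output = compute(split)
    memo_server[key] = store_persistently(output)   # remember for later runs
    return output
```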
Since a reduce task receives input from n map tasks, the key stored in the memoization server consists of the hashes of the outputs of all n map tasks that collectively form the input to the reduce task. Therefore, when executing a reduce task, instead of immediately copying the output of the map tasks, the reduce task consults the map tasks for their respective hashes to determine whether the task has already been computed in a previous run. If so, the output is fetched directly from the location stored in the memoization server, which avoids re-executing the task.
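Continuing with the same illustrative names, the composite key of a reduce task could be formed from the n map-output hashes as follows; the exact key layout is an assumption.

```python
import hashlib

memo_server = {}   # as in the previous sketch: key -> output location
local_store = {}   # output location -> stored result

def store_persistently(output) -> str:
    location = f"loc-{len(local_store)}"
    local_store[location] = output
    return location

def reduce_memo_key(map_output_hashes) -> str:
    """Composite key: a hash over the output hashes of all n map tasks."""
    return hashlib.sha256("".join(map_output_hashes).encode()).hexdigest()

def run_reduce_task(map_output_hashes, fetch_inputs, compute):
    """Consult the memoization server before copying any map output."""
    key = reduce_memo_key(map_output_hashes)
    if key in memo_server:
        return local_store[memo_server[key]]        # hit: skip re-execution
    output = compute(fetch_inputs())                # miss: copy inputs, then run
    memo_server[key] = store_persistently(output)
    return output
```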
The M3 system [10] has been proposed to support answering continuous queries over streams of data while bypassing HDFS, so that data is processed only through a main-memory data path and disk access is avoided entirely. In this approach, mappers and reducers never terminate: there is only one continuously executing MapReduce job per query operator. In M3, query processing is incremental in that only the new input is processed, and the change in the query answer is represented by three sets of inserted (+ve), deleted (-ve), and updated (u) tuples. The query issuer receives as output a stream that represents the deltas (incremental changes) to the answer. Whenever an input tuple is received, it is transformed into a modify operation (+ve, -ve, or u) that is propagated through the query execution pipeline, producing the corresponding set of modify operations in the answer. Supporting incremental query evaluation requires that some intermediate state be kept at the various operators of the query execution pipeline. Therefore, mappers and reducers run continuously without terminating, and hence can maintain main-memory state throughout the execution.
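As a rough illustration of this delta-based evaluation, the sketch below shows a hypothetical incremental group-by-count operator that keeps its state in main memory and emits +ve, -ve, or u tuples; for simplicity, an input update (u) would be modeled as a -ve followed by a +ve, and the class and method names are invented for illustration.

```python
from collections import defaultdict

class IncrementalCount:
    """Hypothetical incremental operator: counts tuples per key and emits
    answer deltas instead of recomputing the full answer."""

    def __init__(self):
        self.counts = defaultdict(int)   # intermediate main-memory state

    def on_tuple(self, op, key):
        """Apply one modify operation and return the resulting answer deltas."""
        old = self.counts[key]
        self.counts[key] += 1 if op == "+ve" else -1
        new = self.counts[key]
        if old == 0:
            return [("+ve", key, new)]   # group appears in the answer
        if new == 0:
            return [("-ve", key, old)]   # group disappears from the answer
        return [("u", key, old, new)]    # group's count changes

op = IncrementalCount()
print(op.on_tuple("+ve", "a"))   # [('+ve', 'a', 1)]
print(op.on_tuple("+ve", "a"))   # [('u', 'a', 1, 2)]
print(op.on_tuple("-ve", "a"))   # [('u', 'a', 2, 1)]
```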
In contrast to Hadoop's input-split functionality, which splits the input data based on its size, M3 splits the streamed data based on arrival rates: a rate-split layer, sitting between the main-memory buffers and the mappers, is responsible for balancing the stream rates among the mappers. This layer periodically receives rate statistics from the mappers and accordingly redistributes the processing load among them. For instance, a fast stream that could overflow one mapper should be distributed across two or more mappers, whereas a group of slow streams that would underflow their respective mappers should be combined to feed a single mapper.
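One way such rate-based balancing could work is a greedy assignment of streams to mappers by arrival rate, as in the sketch below; assign_streams and the single-number capacity model are assumptions for illustration, not M3's actual interface.

```python
from collections import defaultdict

def assign_streams(stream_rates, mapper_capacity):
    """Greedily pack streams onto mappers by arrival rate (tuples/sec).

    A stream faster than one mapper's capacity is split across several
    mappers; slow streams are combined onto a shared mapper.
    """
    assignments = defaultdict(list)   # mapper index -> [(stream_id, rate)]
    loads = []                        # current load per mapper
    for sid, rate in sorted(stream_rates.items(), key=lambda kv: -kv[1]):
        while rate > 0:
            share = min(rate, mapper_capacity)
            # Try the least-loaded existing mapper; open a new one if full.
            idx = min(range(len(loads)), key=loads.__getitem__, default=None)
            if idx is None or loads[idx] + share > mapper_capacity:
                loads.append(0.0)
                idx = len(loads) - 1
            assignments[idx].append((sid, share))
            loads[idx] += share
            rate -= share
    return dict(assignments)

# A 150-tuples/sec stream (capacity 100) is split across two mappers, while
# the two 20-tuples/sec streams end up combined on the less-loaded mapper.
print(assign_streams({"fast": 150, "slow1": 20, "slow2": 20}, 100))
```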
To support fault tolerance, input data is replicated inside the main-memory buffers, and an input split is not overwritten until the corresponding mapper commits. When a mapper fails, it re-reads its input split from any of the replicas in the buffers. A mapper writes its intermediate key-value pairs in its own main memory and does not overwrite a set of key-value pairs until the corresponding reducer commits. When a reducer fails, it re-reads its corresponding sets of intermediate key-value pairs from the mappers.
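The buffering discipline could be sketched as follows, with all class and method names invented for illustration; the key invariant is that nothing in memory is reclaimed before the downstream consumer commits.

```python
class MemoryBuffers:
    """Hypothetical main-memory buffers holding replicated input splits."""

    def __init__(self, replicas=2):
        self.replicas = replicas
        self.splits = {}   # split_id -> list of in-memory replicas

    def write_split(self, split_id, data):
        """Replicate an input split; it is not overwritten before commit."""
        self.splits[split_id] = [list(data) for _ in range(self.replicas)]

    def read_split(self, split_id):
        """A restarted mapper re-reads its split from any surviving replica."""
        return self.splits[split_id][0]

    def mapper_commit(self, split_id):
        """Only after the mapper commits may the split be reclaimed."""
        del self.splits[split_id]


class MapperMemory:
    """Hypothetical per-mapper memory for intermediate key-value pairs."""

    def __init__(self):
        self.pairs = {}   # batch_id -> intermediate key-value pairs

    def write(self, batch_id, kv_pairs):
        """Keep each batch of pairs until its reducer commits."""
        self.pairs[batch_id] = kv_pairs

    def reread(self, batch_id):
        """A restarted reducer re-reads the pairs from the mapper's memory."""
        return self.pairs[batch_id]

    def reducer_commit(self, batch_id):
        del self.pairs[batch_id]
```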