StateK join(StructureK).
The join interface is used to specify the mapping rules between state KVs
and structure KVs. Given a structure key, a corresponding state key is
returned.
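As a minimal sketch of such a mapping rule, assume the one-to-one case in which a structure key and its state key are the same node ID. The class name IdentityJoin and the use of Integer keys are illustrative assumptions, not part of iMapReduce's API.

```java
// Hypothetical sketch: a stand-alone class, not iMapReduce's real class hierarchy.
class IdentityJoin {
    // Given a structure key, return the state key it should be joined with.
    // In a one-to-one mapping both keys are the same node ID.
    public static Integer join(Integer structureKey) {
        return structureKey;
    }

    public static void main(String[] args) {
        System.out.println(join(7)); // prints 7
    }
}
```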
void map(StateK, StateV, StructureV).
The map interface in iMapReduce has one input key StateK and two input
values: the state data value StateV and the structure data value StructureV.
The iMapReduce framework joins the state KVs and the structure KVs
automatically, so the StateV and StructureV passed to map are the
already-joined state value and structure value.
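The automatic join can be pictured with a small in-memory simulation. This is only a sketch under stated assumptions: the class JoinedMapSketch, the helper runMapPhase, the "key=value" output strings, and the choice to forward the state value to every key listed in the structure value are all hypothetical stand-ins for what the framework and a user's map function would do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation; none of these names belong to iMapReduce itself.
class JoinedMapSketch {
    // Stand-in for the framework's output collector.
    static final List<String> emitted = new ArrayList<>();

    // One-to-one join rule: the structure key and the state key are the same ID.
    public static int join(int structureK) {
        return structureK;
    }

    // User map: receives the state value and the structure value already joined.
    // Here it forwards the state value to every key listed in the structure value.
    public static void map(int stateK, double stateV, int[] structureV) {
        for (int target : structureV) {
            emitted.add(target + "=" + stateV);
        }
    }

    // Mimics the framework: pair each structure KV with the state KV chosen
    // by join, then call map on the joined pair.
    public static List<String> runMapPhase(Map<Integer, Double> state,
                                           Map<Integer, int[]> structure) {
        emitted.clear();
        for (Map.Entry<Integer, int[]> e : structure.entrySet()) {
            int stateK = join(e.getKey());
            map(stateK, state.get(stateK), e.getValue());
        }
        return new ArrayList<>(emitted);
    }

    public static void main(String[] args) {
        Map<Integer, Double> state = new TreeMap<>();
        state.put(1, 0.5);
        state.put(2, 0.25);
        Map<Integer, int[]> structure = new TreeMap<>();
        structure.put(1, new int[] {2, 3});
        structure.put(2, new int[] {1});
        System.out.println(runMapPhase(state, structure)); // [2=0.5, 3=0.5, 1=0.25]
    }
}
```

The user code never looks up the structure value itself; it only sees the pair that the (simulated) framework hands to map.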
void reduce(IK, [IV]).
The reduce interface in iMapReduce is the same as that in MapReduce,
with an input key IK and a list of input values [IV]. Note that the input
values contain only state data, not structure data. The reduce function
outputs a state KV.
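A reduce function of this shape might look like the following sketch, which sums the incoming state values for a key. The class name SumReduceSketch is hypothetical, and the method returns the new state KV instead of emitting it through a framework collector (the interface in the text is declared void), purely to keep the example self-contained.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; returns the state KV rather than emitting it.
class SumReduceSketch {
    // The values carry state information only, so the new state value
    // is computed from them alone (here: their sum).
    public static Map.Entry<Integer, Double> reduce(int key, List<Double> values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return new SimpleEntry<Integer, Double>(key, sum);
    }

    public static void main(String[] args) {
        Map.Entry<Integer, Double> kv = reduce(3, Arrays.asList(0.25, 0.5));
        System.out.println(kv.getKey() + " -> " + kv.getValue()); // 3 -> 0.75
    }
}
```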
float distance(StateK, PrevStateV, CurrStateV).
Users implement the distance interface to specify the distance measurement
using a state key's previous state value PrevStateV and its current state
value CurrStateV. The float values returned for the individual keys are
accumulated to obtain the distance between the results of two consecutive
iterations. For example, Manhattan distance or Euclidean distance can be
used to quantify the difference.
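For instance, a Manhattan-distance implementation returns the absolute difference for each key, and the per-key results are summed. The sketch below assumes integer keys and float state values; the class DistanceSketch and the helper accumulate, which stands in for the framework's accumulation step, are hypothetical.

```java
// Hypothetical sketch of a Manhattan (L1) distance measurement.
class DistanceSketch {
    // Per-key contribution: absolute change of the state value.
    public static float distance(int stateK, float prevStateV, float currStateV) {
        return Math.abs(currStateV - prevStateV);
    }

    // Stand-in for the framework's accumulation over all keys.
    // Assumes prev and curr are indexed by the same keys 0..n-1.
    public static float accumulate(float[] prev, float[] curr) {
        float total = 0f;
        for (int k = 0; k < prev.length; k++) {
            total += distance(k, prev[k], curr[k]);
        }
        return total;
    }

    public static void main(String[] args) {
        float[] prev = {0.5f, 0.25f, 0.25f};
        float[] curr = {0.25f, 0.5f, 0.25f};
        System.out.println(accumulate(prev, curr)); // 0.5
    }
}
```

A Euclidean variant would presumably return the squared difference per key; whether the framework then takes the square root of the accumulated sum is not stated in the text.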
In addition, iMapReduce provides the following job parameters (i.e., JobConf's
parameters) to help users specify an iterative computation:
job.set("mapred.iterjob.state.path", path).
Set the DFS path of the initial state data.
job.set("mapred.iterjob.structure.path", path).
Set the DFS path of the structure data.
job.setInt("mapred.iterjob.maxiter", n).
Set the maximum iteration number n to terminate an iterative computation.
job.setFloat("mapred.iterjob.disthresh", d).
Set the distance threshold as d, which is used to terminate an iterative
computation.
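How the two termination parameters might interact can be sketched with a toy driver loop. The text does not say how the framework combines them, so this example assumes that the computation stops as soon as either the maximum iteration number is reached or the accumulated distance falls below the threshold. The class TerminationSketch and the rule that each iteration halves the distance are hypothetical, chosen only to make the loop runnable.

```java
// Hypothetical driver loop; not iMapReduce's actual scheduler.
class TerminationSketch {
    // Returns the number of iterations executed before termination.
    // maxIter plays the role of mapred.iterjob.maxiter,
    // distThresh the role of mapred.iterjob.disthresh.
    public static int run(int maxIter, float distThresh, float initialDistance) {
        float distance = initialDistance;
        int iter = 0;
        while (iter < maxIter) {
            iter++;
            // Toy model: each iteration halves the distance between
            // consecutive results.
            distance = distance / 2;
            if (distance < distThresh) {
                break; // assumed rule: stop once the distance is below the threshold
            }
        }
        return iter;
    }

    public static void main(String[] args) {
        System.out.println(run(20, 0.01f, 1.0f)); // 7: threshold reached first
        System.out.println(run(3, 0.01f, 1.0f));  // 3: iteration cap reached first
    }
}
```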
3.4.2 PageRank Implementation Example
To show how to implement iterative algorithms in iMapReduce, Figure 3.4 gives
example code for the PageRank algorithm. In PageRank, the state KVs and the
structure KVs have a "one-to-one" mapping. Each node has a PageRank score as
its state value and its set of neighbors as its structure value. The join
interface specifies that node n's structure data corresponds to node n's state
data (Line 1). In the map function (Lines 2-5), each node's PageRank score is
evenly distributed to its neighbors, and the node retains (1 − d)/N for itself,
where N is the total number