Global Positioning System Reference
In-Depth Information
The MapReduce framework partitions the intermediate data generated
by map tasks. This partitioning is performed calling the Partition_base
function presented in Algorithm 3. Partition_base receives an intermediate
key, i.e., k 2 = ( part,win ), as input and generates the corresponding partition
number. C 1− C 3 are constant prime numbers and NUMPARTITIONS is the
maximum number of partitions set by the MapReduce framework. The
partition number for an intermediate key that corresponds to a base partition
is computed using a hash function on part (line 2). When the key corresponds
to a window-pair partition, the partition number is computed using a hash
function on min ( part,win ) and max ( part,win ) (line 6). This last hash function
will generate the same partition number for all intermediate records of a
window-pair partition independently of the specifi c window they belong
to. In the scenario of Fig. 5, Partition_base will generate the same partition
number, i.e. ( P 0 × C 1) mod NUMPARTITIONS , for all the intermediate keys
that correspond to partition P 0. Similarly, the function will generate the
same partition number, i.e., ( P C 2+ P C 3) mod NUMPARTITIONS , for
all the intermediate keys that correspond to partition P 0_ P 1.
Algorithm 3: Partition_base ()
Input: k 2. k 2 = ( part,win )
Output: k 2's partition number
1: if win = 1 then // base partition
2: partition ← ( part × C 1) mod NUMPARTITIONS
3: else // window-pair partition
4: minVal ← min ( part,win )
5: maxVal ← max ( part,win )
6: partition ← ( minVal × C 2 + maxVal × C 3) mod NUMPARTITIONS
7: end if
After identifying the partition numbers of intermediate records, the
shuffl e phase of the MapReduce job sends the intermediate records to
their corresponding reduce nodes. The intermediate records received at
each reduce node are sorted and grouped using the Compare_base function
presented in Algorithm 4. The main goal of the Compare_base function is
to group the intermediate records that belong to the same partition. The
function establishes the order of partitions shown in Fig. 6a. Base partitions
have a lower order than window-pair partitions. Multiple base partitions
are ordered based on their pivot indices. Multiple window-pair partitions
are ordered based on the two associated pivot indices of each partition.
Compare_base receives as input two intermediate record keys, i.e., k 2 1 ,
k 2 2 , and returns 0 (when k 2 1 and k 2 2 belong to the same group), 1 (when
k 2 1 has lower order than k 2 2 ), or +1 (when k 2 1 has higher order than k 2 2 ).
The algorithm considers all the possible combinations of the intermediate
Search WWH ::




Custom Search