Global Positioning System Reference
where $\Delta\varphi = \varphi_2 - \varphi_1$, $\Delta\lambda = \lambda_2 - \lambda_1$, $\varphi_m = (\varphi_1 + \varphi_2)/2$, R is the radius of the earth,
φ and λ are in radians, and geoDist is in the same unit as R.
A Quick Introduction to MapReduce
MapReduce is one of the main software frameworks for distributed
processing (Dean and Ghemawat 2004). This framework is able to process
massive amounts of data and works by dividing the processing task into
two phases, map and reduce, for which the user provides two functions
of the same names. These functions take key-value pairs as input and
produce key-value pairs as output, and have the following general form:
map: (k1, v1) → list(k2, v2)
reduce: (k2, list(v2)) → list(k3, v3)
Note that the input and output types of each function can be different.
However, the input of the reduce function should use the same types as the
output of the map function.
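As an illustration of the two signatures above, here is a minimal word-count sketch. The names `map_fn` and `reduce_fn`, and the choice of a line offset as k1 and a line of text as v1, are assumptions made for this example; they are not taken from any particular MapReduce implementation.

```python
from typing import Iterable, List, Tuple


def map_fn(k1: int, v1: str) -> List[Tuple[str, int]]:
    """map: (k1, v1) -> list(k2, v2).

    Assumed example: k1 is a line offset (unused), v1 is a line of text.
    Each word becomes an intermediate key k2 with the value v2 = 1.
    """
    return [(word, 1) for word in v1.split()]


def reduce_fn(k2: str, values: Iterable[int]) -> List[Tuple[str, int]]:
    """reduce: (k2, list(v2)) -> list(k3, v3).

    The input types (str, int) match the output types of map_fn,
    as the text requires. Here k3 is the word and v3 its total count.
    """
    return [(k2, sum(values))]
```

Note that the map input types (int, str) differ from its output types (str, int), while the reduce input types are the same as the map output types.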
The execution of a MapReduce job works as follows. The framework
splits the input dataset into independent data chunks that are processed by
multiple independent map tasks in a parallel manner. Each map call is given
a pair (k1, v1) and produces a list of (k2, v2) pairs. The output of the map calls
is known as the intermediate output. The intermediate data is transferred
to the reduce nodes by a process known as the shuffle. Each reduce node is
assigned a different subset of the intermediate key space; these subsets are
referred to as partitions. The framework guarantees that all the intermediate
records with the same intermediate key (k2) are sent to the same reducer
node. At each reduce node, all the received intermediate records are sorted
and grouped. Each group is then processed in a single reduce call.
Multiple reduce tasks are also executed in parallel. Each reduce call
receives a pair (k2, list(v2)) and produces as output a list of (k3, v3) pairs.
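The execution flow described above can be sketched as a small single-process simulation. This is only an illustration under stated assumptions: `run_job`, `num_reducers`, and the sensor-reading example are hypothetical, and a real framework runs the map and reduce tasks in parallel on separate nodes instead of in sequential loops.

```python
from collections import defaultdict
from itertools import groupby


def run_job(records, map_fn, reduce_fn, num_reducers=2):
    """Simulate one MapReduce job in memory (hypothetical helper)."""
    # Map phase: each (k1, v1) record yields a list of (k2, v2) pairs.
    intermediate = []
    for k1, v1 in records:
        intermediate.extend(map_fn(k1, v1))

    # Shuffle: route every pair to a partition chosen from its key,
    # so all records with the same k2 reach the same reducer.
    partitions = defaultdict(list)
    for k2, v2 in intermediate:
        partitions[hash(k2) % num_reducers].append((k2, v2))

    # Reduce phase: sort and group the records of each partition,
    # then make one reduce call per group with (k2, list(v2)).
    output = []
    for p in sorted(partitions):
        pairs = sorted(partitions[p], key=lambda kv: kv[0])
        for k2, group in groupby(pairs, key=lambda kv: kv[0]):
            output.extend(reduce_fn(k2, [v2 for _, v2 in group]))
    return output


# Assumed example: find the maximum reading per sensor.
def parse_reading(k1, v1):
    sensor, value = v1.split(",")
    return [(sensor, int(value))]


def max_reading(k2, values):
    return [(k2, max(values))]


records = [(0, "s1,3"), (1, "s2,7"), (2, "s1,9")]
result = sorted(run_job(records, parse_reading, max_reading))
```

The final `sorted` call is only there to make the result order independent of which partition each key lands in.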
The processes of transferring the map outputs to the reduce nodes, sorting
the records at each destination node, and grouping these records are driven
by the partition, sortCompare, and groupCompare functions, respectively. These
functions have the following form:
partition: k2 → partitionNumber
sortCompare: (k2_1, k2_2) → {-1, 0, 1}
groupCompare: (k2_1, k2_2) → {-1, 0, 1}
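The three function forms above can be sketched as follows. This is a minimal sketch and not the code of any specific framework: the extra `num_partitions` argument, the use of CRC32 as the hash, and the helper `sort_and_group` are all assumptions made for illustration.

```python
import zlib
from functools import cmp_to_key


def partition(k2: str, num_partitions: int) -> int:
    """partition: k2 -> partitionNumber (hash of the key, modulo the count)."""
    return zlib.crc32(k2.encode("utf-8")) % num_partitions


def sort_compare(a, b) -> int:
    """sortCompare: returns -1 if a < b, 0 if a == b, and 1 if a > b."""
    return (a > b) - (a < b)


# In this sketch grouping uses the same comparison as sorting.
group_compare = sort_compare


def sort_and_group(pairs):
    """Sort (k2, v2) pairs with sort_compare, then merge neighbours
    whose keys group_compare reports as equal into (k2, list(v2))."""
    ordered = sorted(
        pairs, key=cmp_to_key(lambda x, y: sort_compare(x[0], y[0]))
    )
    groups = []
    for k2, v2 in ordered:
        if groups and group_compare(groups[-1][0], k2) == 0:
            groups[-1][1].append(v2)
        else:
            groups.append((k2, [v2]))
    return groups
```

Keeping the two comparators as separate functions means one of them can be replaced independently, for example to sort on a full key while grouping on only part of it.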
The default implementation of the partition function receives an
intermediate key (k2) as input and generates a partition number based on
a hash value for k2. The default sortCompare and groupCompare functions
directly compare two intermediate keys (k2_1, k2_2) and return -1 (k2_1 < k2_2),
0 (k2_1 = k2_2), or +1 (k2_1 > k2_2). The result of using the default comparator