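Step 3: Compute the minimal preference: [equations (10) and (11) are garbled in the source; surviving fragments: "1 max s ," and "2 max max s , , s ,"]
Step 4: Compute the minimal preference: [equation (12) is garbled in the source; surviving fragment: "1 2"]
Step 5: Compute the step: [equation (13) is missing in the source]
Step 6: Initialize the preferences: [equation (14) is missing in the source]
Step 7: Update the preferences: [equation (15) is missing in the source]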
The preference value ranges between two bounds whose symbols are missing from the source.
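The surviving fragments of equations (10) to (12) are consistent with the preference-range bounds commonly used with affinity propagation, so the sketch below assumes that reading. It is not recovered from the source: the names d1, d2, preference_range and preference_sweep are hypothetical, self-similarities are assumed to count as 0 while the bounds are computed, and the evenly spaced sweep only stands in for the lost equations (13) to (15).

```python
from itertools import combinations


def preference_range(S):
    """Return (p_min, p_max) for a square similarity matrix S (list of lists).

    Assumption: the commonly used AP preference bounds, not the source's
    exact equations (10)-(12), which did not survive.
    """
    n = len(S)
    # Assumption: self-similarity counts as 0 while computing the bounds.
    Z = [[0.0 if i == k else S[i][k] for k in range(n)] for i in range(n)]
    # d1: best total similarity when one point k serves as the only center.
    d1 = max(sum(Z[i][k] for i in range(n)) for k in range(n))
    # d2: best total similarity when two points j and k share the data.
    d2 = max(
        sum(max(Z[i][j], Z[i][k]) for i in range(n))
        for j, k in combinations(range(n), 2)
    )
    p_min = d1 - d2
    # Upper bound: the largest similarity between two distinct points.
    p_max = max(S[i][k] for i in range(n) for k in range(n) if i != k)
    return p_min, p_max


def preference_sweep(p_min, p_max, n_steps):
    """Yield preferences from p_min to p_max in equal steps (hypothetical)."""
    step = (p_max - p_min) / n_steps      # stands in for Step 5
    for t in range(n_steps + 1):          # Step 6 (t = 0) and Step 7 (t > 0)
        yield p_min + t * step
```

For three points at 0, 1 and 5 on a line with negative squared distance as similarity, preference_range returns bounds of -16 and -1, and a three-step sweep visits -16, -11, -6 and -1.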
In the mapping stage, every mapper receives a different partition of the data
from HDFS and runs AP on its assigned input data set. Each mapper computes its
own similarity matrix and preference values. After AP, each mapper obtains its
own result, just as in single-node processing, so each mapper holds its own
information, such as centers and key values. The mappers send this information
to the reducing stage as tuples < key, i, k >, where < key, i, k > means that the
center of point i is point k.
All parameters are transmitted between the mappers and the reducer. The
Map/Reduce unit collects the values that share the same key and processes them
together.
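The data flow of the mapping stage can be sketched as follows. This is a minimal illustration, not the authors' implementation: run_mapper, shuffle and toy_cluster are hypothetical names, the key is assumed to be a (mapper id, center id) pair because the source does not define it, and toy_cluster is a simple gap-based stand-in that is not affinity propagation.

```python
from collections import defaultdict


def toy_cluster(points, gap=2.0):
    """Stand-in for AP (NOT affinity propagation): split sorted 1-D values
    wherever two neighbours are more than `gap` apart. The first point of
    each run acts as the center."""
    centers, current, prev = {}, None, None
    for pid, x in sorted(points.items(), key=lambda kv: kv[1]):
        if prev is None or x - prev > gap:
            current = pid                 # start a new local cluster
        centers[pid] = current
        prev = x
    return centers


def run_mapper(mapper_id, points, cluster_fn=toy_cluster):
    """Cluster one partition in isolation and emit (key, i, k) records,
    meaning that the center of point i is point k."""
    for i, k in cluster_fn(points).items():
        key = (mapper_id, k)              # hypothetical key: one per local cluster
        yield key, i, k


def shuffle(records):
    """Group records by key, as the Map/Reduce framework would before reducing."""
    groups = defaultdict(list)
    for key, i, k in records:
        groups[key].append((i, k))
    return dict(groups)
```

Each mapper sees only its own partition, so run_mapper can be called once per partition and the emitted records concatenated before shuffle groups them.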
The mappers must process their own data in isolation. There are two
restrictions on using AP with Map/Reduce. First, AP requires many iterations;
if the mappers transmitted information to each other, unpredictable costs could
occur. Second, in this Map/Reduce architecture every mapper runs independently
and cannot transmit data to other mappers.
There are two parts in the reducing stage. First, ReducerA collects the
information from each mapper, uses it to calculate the center points, and sends
the result to ReducerB. Second, ReducerB collects the centers of all clusters;
the mappers may have calculated different centers. To decide a suitable centroid
of the centers from ReducerA, we use
Centroid = [right-hand side is missing in the source]    (16)
After the clusters are merged, all points in the same merged cluster share the
same unique key value. The clustered points and their centers together form the
output result.
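The two-part reducing stage can be sketched as below. Because equation (16) did not survive, this sketch assumes the centroid is the arithmetic mean of the merged centers and that centers are merged greedily when they lie within a chosen radius. Both choices, and the names reducer_a, reducer_b, centroid, relabel and radius, are hypothetical.

```python
from math import dist  # Python 3.8+


def centroid(vectors):
    """Arithmetic mean of equal-length coordinate tuples (assumed form of (16))."""
    n = len(vectors)
    return tuple(sum(v[d] for v in vectors) / n for d in range(len(vectors[0])))


def reducer_a(records, coords):
    """Collect the centers reported by all mappers: {center_id: coordinates}."""
    return {k: coords[k] for _, _, k in records}


def reducer_b(centers, radius):
    """Greedily merge centers lying within `radius` of a group's centroid.

    Returns ({center_id: merged_key}, {merged_key: centroid}).
    """
    merged = []                           # each entry is a list of center ids
    for cid in sorted(centers):
        for group in merged:
            if dist(centers[cid], centroid([centers[g] for g in group])) <= radius:
                group.append(cid)
                break
        else:
            merged.append([cid])          # no nearby group: start a new one
    key_of = {cid: key for key, group in enumerate(merged) for cid in group}
    centroids = {key: centroid([centers[c] for c in group])
                 for key, group in enumerate(merged)}
    return key_of, centroids


def relabel(records, key_of):
    """Give every point the unique key of the merged cluster of its center."""
    return {i: key_of[k] for _, i, k in records}
```

With this layout, points whose local centers were merged end up with the same key, which matches the description of the output above. The greedy merge depends on the visiting order of the centers, so a real job would need a more careful merge rule.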