Database Reference
In-Depth Information
score of v along with the outbound neighbors set is written to DFS for feeding the
next iteration MapReduce job. To stop the iterative process, users have to perform
another MapReduce job after each iteration to measure the difference from the result
of the last iteration.
The map and reduce operations can be described as follows.
Ru
Nu
()
()
PageRank Map: For each node u , output KVs
vd
,
, where v ∈  N + ( u ),
+
and output the retained ranking score and its outbound neighbors set,
1 −
d
.
+
u
,
,
Nu
()
V
PageRank Reduce: For each node v , add the retained rank score 1− d
V
and
Ru
Nu
()
()
the values received from any u d
to update R ( v ), and output KV 〈 v ,
+
[ R ( v ), N + ( v )]〉.
3.1.2.2 K-means
K-means [11] is a commonly used clustering algorithm, which partitions n nodes into
k clusters so that the nodes in the same cluster are more similar than those in other
clusters. We describe the algorithm briefly as follows. (1) Start with selecting k ran-
dom nodes as cluster centroids. (2) Assign each node to the nearest cluster centroid.
(3) Update the k cluster centroids by “averaging” the nodes belonging to the same
cluster centroid. Repeat steps (2) and (3) until convergence has been reached. The
algorithm has converged when the assignments no longer change.
The map and reduce operations for the K-means algorithm can be described as
follows.
K-Means Map: For each key nid (node id), compute the distance from the
node to any cluster centroid, and output the closest cluster id cid along with
the node coordinate information ncoord , that is, output KV 〈 cid , ncoord 〉.
K-Means Reduce: For each key cid (cluster id), update the cluster centroid
coordinate by averaging all the nodes' coordinates that belong to cid , and
output cid along with the cluster's updated centroid coordinate ccoord , that
is, output KV 〈 cid , ncoord 〉.
3.1.2.3 Matrix Power Iteration
Square matrices can be multiplied by themselves repeatedly. This repeated multipli-
k
= 1 . This iterative
process is called matrix power iteration (MPI). The operation of each iteration is
matrix multiplication, that is, M k = M × N , where N = M k -1 . If M is a matrix with
element m ij in row i and column j , and N is a matrix with element n jk in row j and
k
cation can be described as a power of the matrix, that is, M
M
Search WWH ::




Custom Search