Databases Reference
In-Depth Information
4.2
Algorithm
S-CLONE consists of two phases: replicate and adjust .
1. Replicate phase: The procedure starts in this phase. It works in a greedy manner,
sequentially considering a node at a time and finding the best way to replicate its
data. Suppose that the nodes are processed in the order f 1, 2, ..., N g . For each
node i (initially node 1) that has not been processed (i.e., all the nodes 1, 2, ...,
i 1 have been processed), the K replicas are assigned to their corresponding
servers as follows. First, the location histogram below is computed for i
X
N
h i .s/ D
r j p js e ji
(4.1)
j D 1
where h i .s/ is the number of i 's neighbor nodes who are primarily stored at
server s. Then, the top K servers (with highest h i values) are chosen to each
store a replica for user i . If there is a tie, the server with the least write load (so
far) is preferred. A special case occurs where there are not enough K servers in
the histogram; in this case, the remaining replicas for node i will be placed in the
Adjust phase below.
2. Adjust phase: The goal of this phase is to find the servers for the remaining
replicas that cannot be placed due to the special case aforementioned above.
Because the read cost is the same no matter where to store these replicas, their
locations are chosen to maximize load balancing. The best way to do this is to
process each remaining replica one by one and place it on the server that currently
has the least load.
After the K replicas of i have been placed, the algorithm moves on to process the
next user, node .i C 1/. Note that in choosing the K servers to replicate the data
for a user, we do not consider its primary server because it makes no sense to put
a replica and its primary on the same server. Also, as only the location information
of the primary copies is used to determine where to replicate a user, the order to
process the users does not have impact on the final replica placement. This is a
desirable property because there may be different structures to represent a social
graph and S-CLONE can work with any structure unbiasedly.
The algorithm above is the basic version of S-CLONE, which treats load
balancing as the secondary objective. For systems that require a better degree of
load balancing, we can extend S-CLONE by enforcing the following constraint:
X
N
X
N
write
s
write
8 s W l
D
w i .p is C x is / .1 C "/.L
=M / D .1 C "/.K C 1/
w i =M
i D 1
i D 1
i.e., the write load of each server cannot exceed the average write load by a factor
of ", a small value representing the desired load balancing. Here, the formulas for
Search WWH ::




Custom Search