Databases Reference
In-Depth Information
l
write are obtained from Eqs. ( 2.7 )and( 2.23 ), respectively. The Replicate
phase is the same as in the basic version of S-CLONE, except that in searching for
the best server to replicate each user, we do not consider those servers that violate
the above new constraint.
write
s
and L
4.3
Numerical Results
In an effort to substantiate the efficiency of S-CLONE, a preliminary evaluation
of S-CLONE has been conducted in [ 38 ]. This evaluation is applied to the basic
version of S-CLONE for the special case where all users in the social graph are
equally active (by setting W D R D 1) and the strength of a social relationship
is either 0 (no relationship) and 1 (socially connected). We discuss this evaluation
here as a way to illustrate the importance of social locality in data replication and by
taking advantage of how S-CLONE performs in comparison with a random-based
technique like Cassandra. The dataset used for the evaluation was obtained from the
Max-Planck Software Institute for Software Systems [ 39 ], containing a Facebook
network of N D 63; 392 users in New Orleans region, with 816,886 links resulting
an average degree of 25.7. This is the same dataset used in the evaluation of S-PUT
discussed in the previous chapter. In the evaluation, the number of storage servers
M varies in the set [ 8 , 16 , 32 ], and the number of replicas K in Œ1; M 1.
S-CLONE is evaluated on top of two partitions of the social graph, which are
resulted from applying random partitioning and METIS partitioning, respectively.
Random partitioning resembles a DHT-based key-value scheme `ala Cassandra to
randomly assign each user to a server. METIS partitioning is a graph partitioning
scheme [ 19 ] that aims to divide a graph into K equal-sized components such that the
social links are maximally preserved in each component. This method is the same
method used in S-PUT discussed in the previous chapter. The purpose of evaluating
with METIS for partitioning is to see how the replication schemes compare in the
case where the partition itself does already preserve some degree of social locality.
Figures 4.1 - 4.3 plot the read cost per user which is the ratio of the total read load
to the number of users, for cases M D 8, M D 16,andM D 32, respectively.
First, we discuss the case where replication is not used, i.e., each user has only one
server location, which is its primary location. Using random partitioning to assign
user data to the servers the average read cost is about 24 server reads per user when
the number of servers is eight (Fig. 4.1 a) and this cost increases slightly as more
servers are deployed, to a cost of about 27 when there are 32 servers (Fig. 4.3 a). If
we use METIS partitioning, the average read cost is about 8 server reads per user
when the number of servers is eight (Fig. 4.1 b), reaching a cost of 12 with 32 servers
(Fig. 4.3 b).
In the case where replication is used, with K replicas per user, the read cost
improves. Regardless of the partitioning method underneath, random replication
improves linearly with the number of replicas, whereas S-CLONE offers a more
Search WWH ::




Custom Search