S-CLONE - Data Storage for Social Networks: A Socially Aware Approach

Databases Reference

In-Depth Information

l

write are obtained from Eqs. ( 2.7 )and( 2.23 ), respectively. The Replicate

phase is the same as in the basic version of S-CLONE, except that in searching for

the best server to replicate each user, we do not consider those servers that violate

the above new constraint.

write

s

and L

4.3

Numerical Results

In an effort to substantiate the efficiency of S-CLONE, a preliminary evaluation

of S-CLONE has been conducted in [ 38 ]. This evaluation is applied to the basic

version of S-CLONE for the special case where all users in the social graph are

equally active (by setting W D R D 1) and the strength of a social relationship

is either 0 (no relationship) and 1 (socially connected). We discuss this evaluation

here as a way to illustrate the importance of social locality in data replication and by

taking advantage of how S-CLONE performs in comparison with a random-based

technique like Cassandra. The dataset used for the evaluation was obtained from the

Max-Planck Software Institute for Software Systems [ 39 ], containing a Facebook

network of N D 63; 392 users in New Orleans region, with 816,886 links resulting

an average degree of 25.7. This is the same dataset used in the evaluation of S-PUT

discussed in the previous chapter. In the evaluation, the number of storage servers

M varies in the set [ 8 , 16 , 32 ], and the number of replicas K in Œ1; M 1.

S-CLONE is evaluated on top of two partitions of the social graph, which are

resulted from applying random partitioning and METIS partitioning, respectively.

Random partitioning resembles a DHT-based key-value scheme `ala Cassandra to

randomly assign each user to a server. METIS partitioning is a graph partitioning

scheme [ 19 ] that aims to divide a graph into K equal-sized components such that the

social links are maximally preserved in each component. This method is the same

method used in S-PUT discussed in the previous chapter. The purpose of evaluating

with METIS for partitioning is to see how the replication schemes compare in the

case where the partition itself does already preserve some degree of social locality.

Figures 4.1 - 4.3 plot the read cost per user which is the ratio of the total read load

to the number of users, for cases M D 8, M D 16,andM D 32, respectively.

First, we discuss the case where replication is not used, i.e., each user has only one

server location, which is its primary location. Using random partitioning to assign

user data to the servers the average read cost is about 24 server reads per user when

the number of servers is eight (Fig. 4.1 a) and this cost increases slightly as more

servers are deployed, to a cost of about 27 when there are 32 servers (Fig. 4.3 a). If

we use METIS partitioning, the average read cost is about 8 server reads per user

when the number of servers is eight (Fig. 4.1 b), reaching a cost of 12 with 32 servers

(Fig. 4.3 b).

In the case where replication is used, with K replicas per user, the read cost

improves. Regardless of the partitioning method underneath, random replication

improves linearly with the number of replicas, whereas S-CLONE offers a more

Data Storage for Social Networks: A Socially Aware Approach

Search WWH ::

Custom Search

Home