interesting pattern. The improvement from S-CLONE is pronounced for the first
increases in the number of replicas but levels off once that number is sufficiently
large. For example, in the case M = 32 when S-CLONE is applied on top of the
random partition (Fig. 4.3 a), the read cost of S-CLONE drops quickly from 24 to 5
as K increases from 1 to 17, but afterwards the decrease is less significant. The drop
is quicker if S-CLONE is applied on top of the METIS partition. This implies that
although both the random partition and METIS partition offer comparable degrees
of load balancing, to achieve the same improvement rate for the total read load,
we need fewer replicas per user if METIS partitioning is used than if random
partitioning is used. The reason, we conjecture, is that METIS preserves
social locality whereas random partitioning does not. Consequently, social locality
should be given high priority in the storage design.
The superiority of S-CLONE to random replication is obvious, especially when
more servers are deployed or when METIS is used for partitioning instead of random
partitioning. For example, on top of random partitioning when M = 32, in order
to achieve a read cost of 5, S-CLONE requires 17 replicas per user (i.e., K = 17)
but random replication requires 26 replicas. On top of METIS partitioning when
M = 32, S-CLONE requires just 3 replicas per user but random replication requires
19 replicas. It is thus important that we take social locality into account not only
when we store the primary data, but also when we replicate it.
We also observe that, for each given M , there is a value for K that maximizes
the efficiency gap between S-CLONE and random replication. For example, in the
case M = 32 (Fig. 4.3a), this value is K = 15. The gap narrows as K
approaches 1 or M − 1. This is understandable because in these
two extreme cases there is no substantial difference in the replica placement using
either replication scheme. It would be interesting, though, to derive a formula for the
optimal value of K that will maximize the efficiency gap.
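
Until such a formula is available, the gap-maximizing K can be read off measured cost curves. The following is a minimal sketch, assuming the two curves are given as mappings from K to read cost; the function name `best_gap` and the sample numbers are hypothetical and are not taken from Fig. 4.3.

```python
def best_gap(cost_sclone, cost_random):
    """Return (K, gap) where gap = cost_random[K] - cost_sclone[K] is largest.

    Both arguments map a replica count K to a measured read cost.
    Only values of K present in both mappings are compared.
    """
    shared = sorted(set(cost_sclone) & set(cost_random))
    if not shared:
        raise ValueError("the two curves share no value of K")
    # On ties, max() keeps the first (smallest) K in sorted order.
    k = max(shared, key=lambda k: cost_random[k] - cost_sclone[k])
    return k, cost_random[k] - cost_sclone[k]


# Made-up illustrative values, not measurements from the text.
toy_sclone = {1: 10.0, 2: 6.0, 3: 5.0}
toy_random = {1: 10.0, 2: 9.0, 3: 6.0}
print(best_gap(toy_sclone, toy_random))
```

An exhaustive scan like this is cheap because K only ranges over a small set of integers for a given M.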
In terms of load balancing, Fig. 4.4 plots the Gini coefficient of S-CLONE for
cases M = 8, M = 16, and M = 32 when it is applied on top of the random
partition and the METIS partition. It is observed that S-CLONE balances the load
better when more servers are deployed or when more replicas are allowed per user.
The Gini coefficient is at most 0.35 when eight servers are deployed and at most
0.17 when 32 servers are deployed. These values are acceptable given the fact that
S-CLONE starts with an existing partition and the results are obtained for the basic
version of S-CLONE with load balancing being the secondary objective, not the
primary. We expect a better Gini coefficient for the enhanced version of S-CLONE,
which enforces a stricter constraint on load balancing.
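
For reference, the Gini coefficient used as the balance metric can be computed from a list of per-server loads. The sketch below uses the standard rank-weighted formula and assumes the coefficient is taken over the loads of the M servers; the function name `gini` is illustrative, and the text does not specify how the loads were measured.

```python
def gini(loads):
    """Gini coefficient of non-negative loads.

    0 means every server carries the same load; the maximum, (n - 1) / n,
    is reached when a single server carries everything.
    """
    values = sorted(loads)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, for x sorted
    # in ascending order and ranks i = 1..n.
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n
```

A perfectly even load such as `[5, 5, 5, 5]` gives 0, while `[0, 0, 0, 4]` gives 0.75, the maximum for four servers.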
4.4 Notes
For OSNs that already employ an arbitrary data partition structure, whose data
need to be replicated, we can increase the extent of social locality during the
replication procedure. S-CLONE is a socially aware replication scheme which,