[Figure 4: distribution of the reconstruction time for different values of d (N=200, s=7, n=14, b=500, MTBF=60 days). Horizontal axis: reconstruction time (cycles). Mean: 10 cycles for d=13, 6 cycles for d=12, 5 cycles for d=11, 4.6 cycles for d=10.]
Fig. 4. Distribution of reconstruction time for different values of degree d
[Figure 5: average reconstruction time (simulation) as a function of d.]
Fig. 5. Average Reconstruction Time for different values of degree d. Smaller d implies more data transfers, but may mean smaller reconstruction times!
the majority of short reconstructions, from 5.8 to 16.2 cycles (the right side of
the rectangular shape). Hence, in Scenario A, having a good estimate of the tail
of the distribution is not at all sufficient to be able to predict the failure rate of
the system. It is necessary to have a good model of the complete distribution!
4.3 Discussion of Parameters of Regenerating Codes
As presented in Section 2, when the redundancy is added using regenerating
codes, n = s + r devices store a fragment of the block, while just s are enough
to retrieve the block. When a fragment is lost, d devices, where s ≤ d ≤ n − 1,
cooperate to restore it. The larger d is, the smaller is the bandwidth needed
for the repair. Figures 4 and 5 show the reconstruction time for different values
of the degree d. We observe an interesting phenomenon: contrary to
common intuition, the average reconstruction time decreases when the degree
decreases: 10 cycles for d = 13, and only 6 cycles for d = 12. The bandwidth
usage increases, though (because δ_MBR is higher when d is smaller). The
explanation is that decreasing the degree introduces a degree of freedom in
the choice of the devices that send a sub-fragment to the device that will store the
repaired fragment. Hence, the system is able to reduce the load on the most heavily
loaded disks and to balance the load more evenly between devices.
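The trade-off described above can be illustrated with a small sketch. It assumes the standard minimum-bandwidth-regenerating (MBR) overhead factor δ_MBR = 2d / (2d − s + 1), read here as the data transferred per repair in units of (block size / s); the exact definition given in Section 2 may differ. The function pick_providers is a hypothetical selection policy (take the d least-loaded surviving devices), used only to show the freedom of choice that appears when d < n − 1. It is not the policy of the simulated system.

```python
def mbr_repair_overhead(s: int, d: int) -> float:
    """Assumed MBR overhead factor 2d / (2d - s + 1).

    Interpreted as data transferred per repair, in units of
    (block size / s). This formula is an assumption, not taken
    from the text above.
    """
    if d < s:
        raise ValueError("the degree d must satisfy d >= s")
    return 2 * d / (2 * d - s + 1)


def pick_providers(load: dict[str, int], d: int) -> list[str]:
    """Hypothetical policy: pick the d least-loaded surviving devices.

    `load` maps a device name to its number of pending transfers.
    Ties are broken by device name so the result is deterministic.
    """
    if d > len(load):
        raise ValueError("not enough surviving devices for degree d")
    return sorted(load, key=lambda dev: (load[dev], dev))[:d]


if __name__ == "__main__":
    s, n = 7, 14  # parameters of Fig. 4
    for d in range(n - 1, 9, -1):  # d = 13, 12, 11, 10
        print(f"d={d}: assumed overhead = {mbr_repair_overhead(s, d):.3f}")

    # Toy example: 4 surviving devices with uneven load.
    survivors = {"a": 5, "b": 1, "c": 9, "d": 2}
    # With d = 4 every survivor must participate, including busy "c".
    print(pick_providers(survivors, 4))
    # With d = 3 the busiest device can be skipped.
    print(pick_providers(survivors, 3))
```

Under this assumed formula and s = 7, the overhead factor grows from 1.3 at d = 13 to about 1.43 at d = 10, while the smaller degree lets the repair avoid the most heavily loaded devices.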
5 Experimentation
To validate the simulation and the model results, we performed a batch
of real experiments on the Grid'5000 platform, an experimental
platform for the study of large-scale distributed systems. It provides over 5000
computing cores across multiple sites in France, Luxembourg and Brazil. We used a
prototype of a storage system implemented by a private company (Ubistorage 2).
Our goal is to validate the main behavior of the reconstruction time in a real
environment with shared and constrained bandwidth, and to measure how close
the measured times are to our results.
2 http://www.ubistorage.com/