Repair Time in Distributed Storage Systems
Frederic Giroire¹, Sandeep Kumar Gupta², Remigiusz Modrzejewski¹,
Julian Monteiro³, and Stephane Perennes¹
¹ Project MASCOTTE, I3S (CNRS/Univ. of Nice)/INRIA, Sophia Antipolis, France
² IIT Delhi, New Delhi, India
³ Department of Computer Science, IME, University of Sao Paulo, Brazil
Abstract. In this paper, we analyze a highly distributed backup storage
system realized by means of nano datacenters (NaDa). NaDa have recently
been proposed as a way to mitigate the growing energy, bandwidth and
device costs of traditional data centers, following the popularity of
cloud computing. These service provider-controlled peer-to-peer systems
take advantage of resources already committed to always-on set-top boxes,
the fact that they do not generate heat dissipation costs, and their
proximity to users.
In this kind of system, redundancy is introduced to preserve the data
in case of peer failures or departures. To ensure long-term fault tolerance,
the storage system must have a self-repair service that continuously
reconstructs the fragments of redundancy that are lost. In the literature,
the reconstruction times are modeled as independent. In practice, however,
numerous reconstructions start at the same time (when the system
detects that a peer has failed).
We propose a new analytical framework that takes this correlation into
account when estimating the repair time and the probability of data
loss. We show that the load is unbalanced among peers (young peers
inherently store less data than old ones). The models and schemes
proposed are validated by mathematical analysis, an extensive set of
simulations, and experimentation using the GRID5000 test-bed platform.
This new model allows system designers to make a more accurate choice of
system parameters as a function of their targeted data durability.
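The load imbalance mentioned in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' model: it assumes one uniformly random peer failure per time step, an empty replacement peer, and reconstruction of every lost fragment on a uniformly random peer. The function names and all parameter values are hypothetical, chosen only for illustration.

```python
import random


def simulate_load(num_peers=200, fragments_per_peer=50, steps=2000, seed=0):
    """Toy model (an assumption, not the paper's): one random peer fails per step.

    The failed peer is replaced by a new, empty peer, and each fragment it
    held is reconstructed on a peer chosen uniformly at random.
    Returns (ages, load), one entry per peer.
    """
    rng = random.Random(seed)
    birth = [0] * num_peers                   # step at which each peer joined
    load = [fragments_per_peer] * num_peers   # fragments stored by each peer

    for t in range(1, steps + 1):
        failed = rng.randrange(num_peers)
        lost = load[failed]
        # The failed peer is replaced by a fresh peer with an empty disk.
        birth[failed] = t
        load[failed] = 0
        # All lost fragments start reconstruction at the same time.
        for _ in range(lost):
            load[rng.randrange(num_peers)] += 1

    ages = [steps - b for b in birth]
    return ages, load


def mean_load_by_age(ages, load):
    """Return the mean load of the younger half and of the older half of the peers."""
    order = sorted(range(len(ages)), key=lambda i: ages[i])
    half = len(order) // 2
    young = sum(load[i] for i in order[:half]) / half
    old = sum(load[i] for i in order[half:]) / (len(order) - half)
    return young, old


if __name__ == "__main__":
    ages, load = simulate_load()
    young, old = mean_load_by_age(ages, load)
    print(f"mean load, younger half: {young:.1f}; older half: {old:.1f}")
```

Under these assumptions the total number of fragments is conserved, and a peer accumulates fragments roughly in proportion to the time it has been in the system, so the older half of the population ends up holding most of the data.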
1 Introduction
Nano datacenters (NaDa) are highly distributed systems owned and controlled
by the service provider. This alleviates the need for incentives and
mitigates the risk of malicious users, but otherwise they face the same
challenges as peer-to-peer systems. The set-top boxes realizing them are
connected using consumer links, which can be relatively slow, unreliable
and congested. The devices themselves, compared to servers in a traditional
datacenter, are prone to failures and temporary disconnections, e.g. if
the user cuts the power supply when not in
The research leading to these results has received funding from the European Project
FP7 EULER, ANR CEDRE, ANR AGAPE, Associated Team AlDyNet, project
ECOS-Sud Chile and region PACA.
 