6
Cost-effective data reliability assurance for data maintenance
In this chapter, we present our novel cost-effective data reliability assurance mechanism, named Proactive Replica Checking for Reliability (PRCR), for maintaining Cloud data in a cost-effective fashion. PRCR has the following features:
• By coordinating with the minimum replication calculation approach and the data recovery approach, PRCR maintains Cloud data files at the minimum replication level, in which no more than two replicas are created for each data file.
• By using the abundant Cloud computing resources in the form of Cloud computing instances, PRCR can flexibly maintain big data in the Cloud comprising a huge number of Cloud data files, while providing a wide variety of data reliability assurance to meet storage users' reliability requirements.
• By checking the replicas of each data file regularly in a proactive fashion, PRCR is able to detect any replica loss incident and cooperate with the data recovery process. In this way, PRCR makes sure that the overall data reliability assurance is not jeopardized.
• Compared with the huge number of Cloud data files that PRCR is able to maintain, the running overhead of PRCR is very small and can be neglected. By using PRCR for data reliability management, the excessive data replicas generated in current Clouds can be minimized, so that the storage cost can be significantly reduced.
This chapter is organized as follows. In Section 6.1, we explain how proactive replica checking can be used for providing data reliability assurance. In Section 6.2, we present the high-level structure of PRCR. In Section 6.3, we present a more detailed design of PRCR, describing its working process for maintaining a Cloud data file throughout its life cycle. In Section 6.4, we present two algorithms for optimizing PRCR: the minimum replication algorithm for determining the minimum number of replicas, and the metadata distribution algorithm for maximizing the utilization of the PRCR capacity. In Section 6.5, we evaluate PRCR in terms of performance and cost-effectiveness. Finally, in Section 6.6, we summarize the work presented in this chapter. This chapter is mainly based on our work [33].
6.1
Proactive replica checking
There is a well-known property of the exponential distribution called the memoryless property, which states that for all s, t > 0,

P(T > s + t | T > s) = P(T > t).

In other words, given T > s, the probability distribution of T from time s to s + t is equivalent to that from time 0 to t. For data reliability specifically, this property means that as long as we know the data file is not lost at a given moment, the probability of the data file surviving for a further time t follows the same probability distribution.
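The memoryless property can be checked numerically. The following is a minimal sketch, assuming that the lifetime T of a replica follows an exponential distribution with a constant failure rate, so that P(T > t) = exp(-rate * t) and the ratio exp(-rate * (s + t)) / exp(-rate * s) reduces to exp(-rate * t). The function names and the rate value in the usage note are illustrative assumptions and are not part of PRCR.

```python
import math
import random


def survival(rate, t):
    """P(T > t) for an exponentially distributed lifetime T with the given failure rate."""
    return math.exp(-rate * t)


def conditional_survival(rate, s, t):
    """P(T > s + t | T > s), computed from the definition of conditional probability."""
    return survival(rate, s + t) / survival(rate, s)


def simulated_conditional_survival(rate, s, t, trials=200_000, seed=0):
    """Monte Carlo estimate of P(T > s + t | T > s) from sampled lifetimes."""
    rng = random.Random(seed)
    lifetimes = [rng.expovariate(rate) for _ in range(trials)]
    # Keep only the lifetimes that are known to have survived up to time s.
    survivors = [x for x in lifetimes if x > s]
    return sum(x > s + t for x in survivors) / len(survivors)
```

For example, with a hypothetical failure rate of 0.1 per time unit, `conditional_survival(0.1, 3.0, 1.0)` agrees with `survival(0.1, 1.0)` up to floating-point rounding, and the simulated estimate is close to the same value. A replica that is confirmed to exist at time s therefore has the same chance of surviving a further time t as a newly created one, which is why confirming that a replica is not lost is useful for reliability assurance.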
 
 