approach in the Cloud. The data reliability model should be able to describe the reli-
ability of the Cloud data throughout their life cycles, in which they are stored with dif-
ferent redundancy levels and stored on different storage devices with different failure
rate patterns in different stages, respectively.
To facilitate our research, our data reliability model should be consistent with the
preceding analysis as well as the literature review in Chapter 2.
Therefore, first, from the hardware aspect, our data reliability model should be able to
precisely describe the relationship between data reliability and the failure pattern of
storage devices. As mentioned in Section 2.1, storage device failure is the source
of storage failure and data loss. A precise description of the impact of storage devices on
data reliability could substantially improve the model's ability to predict data reliability,
that is, the data loss rate, after the data have been stored for a certain period of time.
Second, the data reliability model must be able to describe the reliability of Cloud
data stored in the form of replicas. The number of replicas represents the redundancy
level of the data. In the data reliability model, the relationship between data reliability
level and the number of replicas needs to be reflected. Third, in order to describe the
reliability of Cloud data throughout their life cycles, the model must be able to reflect
the changes in replica number, that is, data redundancy level, so as to correspond to the
life cycle stages of data creation, data maintenance, and data recovery.
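As a rough illustration of the first two requirements, and not the model developed in this topic, the standard reliability-theory relationship can be written down under two simplifying assumptions: each storage device has a time-dependent failure rate, and the replicas fail independently with no recovery during the storage period. The symbols $\lambda(t)$, $T$, $k$, $R_1$, and $R_k$ are introduced here for illustration only.

```latex
% Assumed: \lambda(t) is the failure rate of one storage device at time t,
% T is the storage duration, and k is the number of replicas.
\begin{align}
  R_1(T) &= \exp\!\left(-\int_0^T \lambda(t)\,dt\right)
    && \text{(one replica survives the period)} \\
  R_k(T) &= 1 - \bigl(1 - R_1(T)\bigr)^{k}
    && \text{(at least one of $k$ independent replicas survives)}
\end{align}
```

Under these assumptions, a change in the replica number during the data life cycle corresponds to a change in $k$ from one stage to the next.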
3.2.4.2 Minimum replication calculation and benchmark
Once metadata such as data size, expected data storage duration, and data reliability
requirements have been collected and the corresponding storage device has been determined,
the interface between the Cloud and the storage user needs, where necessary, to determine the
minimum number of replicas to create. This
calculation should be fast and incur low overhead. Moreover, to facilitate the data
maintenance mechanism, the minimum replication calculation approach
must also predict the reliability of data that are stored for a certain period of time.
However, with a variable disk failure rate pattern, the overhead of such a calculation
could be a concern, and hence optimization needs to be conducted to reduce the over-
head of the data reliability prediction process.
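The kind of calculation described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the approach proposed in this topic: the disk failure rate is assumed to be piecewise constant over consecutive stages, replicas are assumed to fail independently with no recovery during the storage period, and the function names, the `rate_stages` format, and the example rates are all hypothetical.

```python
import math


def single_replica_reliability(rate_stages, duration_years):
    """Probability that one replica survives `duration_years`.

    `rate_stages` is a list of (stage_length_years, failures_per_year)
    tuples (an assumed format). The last stage's rate is applied to any
    time beyond the listed stages.
    """
    remaining = duration_years
    cumulative_hazard = 0.0
    for index, (length, rate) in enumerate(rate_stages):
        is_last = index == len(rate_stages) - 1
        # Time spent in this stage: all remaining time if it is the last one.
        span = remaining if is_last else min(remaining, length)
        cumulative_hazard += rate * span
        remaining -= span
        if remaining <= 0:
            break
    return math.exp(-cumulative_hazard)


def min_replicas(required_reliability, rate_stages, duration_years,
                 max_replicas=10):
    """Smallest replica count whose combined reliability meets the target.

    Assumes independent replica failures and no recovery in the period.
    """
    r1 = single_replica_reliability(rate_stages, duration_years)
    for k in range(1, max_replicas + 1):
        # Data are lost only if all k replicas are lost.
        if 1.0 - (1.0 - r1) ** k >= required_reliability:
            return k
    raise ValueError("requirement not met within max_replicas")


# Hypothetical failure rate pattern: 2%/year for the first year,
# 1%/year for the next two years, then 5%/year.
example_stages = [(1.0, 0.02), (2.0, 0.01), (1.0, 0.05)]
replicas_needed = min_replicas(0.999, example_stages, duration_years=1.0)
```

With a piecewise-constant rate, the cumulative hazard is a short sum rather than a numerical integral, which is one simple way to keep the overhead of the prediction low.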
3.2.4.3 Cost-effective data reliability assurance mechanism
To maintain Cloud data throughout their life cycle, we need
to design a data reliability assurance mechanism that could replace the conventional
three-replica data storage strategy in current Clouds. The design of a cost-effective
data reliability assurance mechanism in the Cloud faces the following three major
challenges.
First, the mechanism should run in a cost-effective fashion so that the Cloud data
storage cost can be reduced. This requires not only reducing the number of replicas but also
accounting for the overhead incurred by running the mechanism itself.
Second, the mechanism should be able to effectively utilize the computation and storage
power of the Cloud, so that the big data in the Cloud could be managed properly.