1.2.2.3 Data recovery
At the data maintenance stage of the Cloud data life cycle, replicas could be lost due
to storage failures. In order to either restore the redundancy level of the Cloud data or
prevent the data from being lost entirely, data recovery is needed. At this stage, certain
mechanisms are applied to recover the lost replicas. For various purposes, these
mechanisms follow different data recovery policies, and the duration of the data recovery
stage could vary. From the data reliability aspect, the data need to be recovered before
the data reliability assurance becomes too low to meet the storage user's requirement.
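
The recovery deadline described above can be illustrated with a minimal sketch. It is not the reliability model developed later in this topic. It assumes, purely for illustration, that each replica fails independently at a constant rate (exponential lifetimes) and that no recovery happens in the meantime. The function name, parameters, and numbers are hypothetical.

```python
import math


def recovery_deadline(failure_rate: float, replicas_left: int,
                      required_reliability: float) -> float:
    """Longest time the surviving replicas can be left unrecovered before the
    probability that at least one of them survives drops below the requirement.

    Assumption (illustrative only): independent replicas, constant failure
    rate. The result is in the time unit of 1 / failure_rate.
    """
    if failure_rate <= 0 or replicas_left < 1:
        raise ValueError("failure_rate must be > 0 and replicas_left >= 1")
    if not 0.0 < required_reliability < 1.0:
        raise ValueError("required_reliability must be strictly between 0 and 1")
    # P(all k replicas lost by time t) = (1 - exp(-rate * t)) ** k.
    # Setting this equal to 1 - required_reliability and solving for t:
    per_replica_loss = (1.0 - required_reliability) ** (1.0 / replicas_left)
    return -math.log(1.0 - per_replica_loss) / failure_rate


# Hypothetical numbers: 1% failure rate per year, 99% reliability requirement.
one_left = recovery_deadline(0.01, 1, 0.99)
two_left = recovery_deadline(0.01, 2, 0.99)
print(f"1 replica left: recover within {one_left:.2f} years")
print(f"2 replicas left: recover within {two_left:.2f} years")
```

Under these assumptions, the fewer replicas that remain, the sooner recovery has to finish, which is why the duration of the recovery stage can differ between policies.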
1.2.2.4 Data deletion
When the data are no longer needed, they are deleted. The storage space reclamation
mechanism of the Cloud (if any) then recycles the pre-occupied storage space, and the
life cycle of the Cloud data ends. Hence this stage of the Cloud data life cycle will not
be discussed in this topic any further. However, as we will explain later in the topic,
for determining the proper data reliability assurance that meets the storage user's data
reliability requirement, it is preferable that the expected storage duration be given
when the data are created.
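
To see why the expected storage duration matters, consider the following rough sketch. It is an illustration under stated assumptions, not the approach proposed in this topic: replicas are assumed to fail independently at a constant rate, and lost replicas are assumed never to be recovered during the storage period, which is pessimistic. The function name and all numbers are hypothetical.

```python
import math


def min_replicas(failure_rate: float, storage_duration: float,
                 required_reliability: float, max_replicas: int = 10) -> int:
    """Smallest replica count whose survival probability over the whole
    storage duration meets the requirement.

    Assumption (illustrative only): independent replicas, constant failure
    rate, no recovery of lost replicas during the storage period.
    """
    # Probability that a single replica is lost within the storage duration.
    loss_one = -math.expm1(-failure_rate * storage_duration)
    for k in range(1, max_replicas + 1):
        # The data survive if at least one of the k replicas survives.
        if 1.0 - loss_one ** k >= required_reliability:
            return k
    raise ValueError("requirement not reachable within max_replicas")


# Hypothetical numbers: 5% failure rate per year, 99.9% reliability requirement.
for years in (0.1, 1.0, 10.0):
    print(years, "years ->", min_replicas(0.05, years, 0.999), "replicas")
```

With the same requirement, a longer storage duration calls for more replicas in this sketch, so the redundancy level cannot be chosen sensibly without knowing how long the data are expected to be kept.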
1.3 Key issues of research
The research in this topic involves two major aspects: cost-effective data storage and
data reliability. On the one hand, the storage cost depends heavily on the redundancy level
of the data. By reducing the redundancy of the Cloud data, the storage cost can be
reduced proportionally. Given the massive amount of big data in the Cloud, the storage
cost saved can be huge. On the other hand, reducing redundancy also means that
the data reliability may be jeopardized, that is, the data may not survive until they are
deleted (or discarded). In order to provide cost-effective data storage while meeting
the data reliability requirement of the Cloud storage users throughout the Cloud data
life cycle, our research involves the following key issues.
1. Data reliability model
First, we need a model to describe Cloud data reliability and Cloud data reliability-related
factors, which is essential for the design of the data reliability assurance approach in the
Cloud. The data reliability model should be able to describe the reliability of the Cloud data
throughout their life cycles, in which the data are stored with different redundancy levels and
stored on different storage devices at different stages respectively.
2. Determination of the minimum replication
In order to reduce the storage cost in the Cloud, we need to determine the minimum data
redundancy level for meeting the data reliability requirement. As will be further explained in
Chapter 3, our research focuses on the data reliability issue in the Cloud with a
replication-based data storage scheme. Therefore, in order to store the Cloud data in a
cost-effective fashion, at the data creation stage of the Cloud data life cycle, the number of
replicas created for the Cloud data needs to be minimized. Based on the data reliability model, we need an
approach that predicts the data reliability under certain given replication levels so that the
 