with a huge number of data files in a cost-effective fashion. First, we explained the principle
of maintaining data reliability by proactive replica checking. Second, we presented the struc-
ture of PRCR, in which the two major parts, the user interface and the PRCR node, were
presented in detail. Third, we presented the working process of PRCR by following the life
cycle of a data file managed by PRCR in the Cloud. Fourth, we presented two algorithms for
optimizing the performance of PRCR, which are the minimum replication algorithm and the
metadata distribution algorithm. Finally, we presented the evaluation of PRCR, in which the
performance and cost-effectiveness of PRCR were evaluated against the widely used
conventional three-replica data storage strategy.
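The minimum replication idea mentioned above can be illustrated with a small sketch. Assuming, purely for illustration, an exponential disk failure model with a constant annual failure rate and independent replica failures (the function names and failure model here are assumptions, not the book's actual algorithm), the minimum replication is the smallest replica count whose combined survival probability meets the reliability requirement:

```python
import math

def replica_survival(failure_rate: float, years: float) -> float:
    """Probability that a single replica survives the storage period,
    under an assumed exponential failure model with a constant
    annual failure rate."""
    return math.exp(-failure_rate * years)

def minimum_replicas(failure_rate: float, years: float,
                     reliability_requirement: float) -> int:
    """Smallest n such that 1 - (1 - p)^n >= requirement, where p is
    the single-replica survival probability (replicas assumed to
    fail independently)."""
    p = replica_survival(failure_rate, years)
    n = 1
    while 1 - (1 - p) ** n < reliability_requirement:
        n += 1
    return n

# Example: 2% annual failure rate, 1-year storage, 99.999% requirement
print(minimum_replicas(0.02, 1.0, 0.99999))  # -> 3
```

Under these illustrative numbers, two replicas fall short of the 99.999% requirement while three suffice, which is the kind of case-by-case decision that lets a minimum-replication scheme beat a fixed three-replica policy on storage cost.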
In Chapter 7, we presented our novel energy-efficient data transfer strategy, Link Rate
Controlled Data Transfer (LRCDT), for reducing the data transfer cost incurred during data
creation or data recovery processes. First, we presented the formulas for calculating data
transfer deadlines for data creation and data recovery processes, respectively. Second, we
presented the Cloud network model for the Cloud with bandwidth reservation, in which
four submodels were presented. Third, we presented the energy consumption model of net-
work devices in the Cloud. Fourth, we presented the LRCDT strategy in detail. Finally, we
presented the evaluation of LRCDT, in which the energy consumption and task completion
time of the strategy were evaluated against the existing minimum-speed and
maximum-speed data transfer strategies.
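The core idea behind LRCDT, transmitting at the lowest link rate that still meets the transfer deadline instead of at maximum speed, can be sketched as follows. The set of available link rates and the assumption that device energy grows with link rate are illustrative; this is not the book's actual network or energy model.

```python
def pick_link_rate(data_size_gb: float, deadline_s: float,
                   available_rates_gbps: list[float]) -> float:
    """Return the lowest available link rate (Gbit/s) that still
    completes the transfer before the deadline.
    Assumes an idealized link with no protocol overhead."""
    required = data_size_gb * 8 / deadline_s  # minimum rate in Gbit/s
    feasible = [r for r in sorted(available_rates_gbps) if r >= required]
    if not feasible:
        raise ValueError("deadline cannot be met at any available rate")
    return feasible[0]

# Example: 100 GB within 1 hour over links supporting 1/2.5/10/40 Gbit/s
print(pick_link_rate(100, 3600, [1, 2.5, 10, 40]))  # -> 1
```

Because network-device energy consumption generally increases with link rate, picking the slowest deadline-feasible rate trades unused slack time for lower energy, which is the trade-off LRCDT exploits.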
With all of the preceding content, our cost-effective replication-based Cloud
storage solution for the reliability assurance of big data is complete. Each
part of the solution, including the data reliability model, the algorithms, and
the cost-effective data reliability assurance approaches for the data creation,
data maintenance, and data recovery stages of the Cloud data life cycle, has
been presented in detail.
8.2 Key contributions of this topic
In this topic, our research focused on the issue of providing cost-effective storage
while meeting the reliability requirements of big data in the Cloud. Based on
systematic investigation of existing distributed storage technologies and of Cloud
storage and network environments, we provided a systematic cost-effective Cloud
data storage solution in which the data reliability requirement of each data file is
considered throughout the whole data life cycle. Given the rapid development of
data-intensive applications in the Cloud and the dramatic growth of Cloud data,
the significance of this research is clear. In particular, the major contributions of
this topic can be summarized in the following four parts:
First, a novel generic data reliability model for Cloud data storage is proposed for describing
the reliability of Cloud data with multiple replicas stored on devices with variable failure
patterns. As far as we know, this model is one of the few that investigate data replication
techniques under a variable disk failure rate.
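To make the idea concrete, the survival probability of a file stored with $n$ replicas can be written in the following general form; the notation here is illustrative and may differ from the book's own.

```latex
% Survival probability of replica i over [0, t], with a
% time-varying disk failure rate \lambda_i(\tau):
R_i(t) = \exp\!\left(-\int_0^t \lambda_i(\tau)\,\mathrm{d}\tau\right)

% The file is lost only if all n replicas fail, so assuming
% independent replica failures the file reliability is:
R_{\mathrm{file}}(t) = 1 - \prod_{i=1}^{n} \bigl(1 - R_i(t)\bigr)
```

A constant failure rate $\lambda_i(\tau) = \lambda$ recovers the familiar exponential model $R_i(t) = e^{-\lambda t}$; the integral form is what allows the model to capture variable failure patterns such as disk aging.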
Second, a new minimum replication calculation approach is proposed for calculating the
minimum number of replicas needed to meet the data reliability requirement. The
minimum replication can also act as a benchmark for evaluating the cost-effectiveness
of various replication-based data storage approaches. This approach is able to effectively