with a huge number of data files in a cost-effective fashion. First, we explained the principle
of maintaining data reliability by proactive replica checking. Second, we presented the struc-
ture of PRCR, in which the two major parts, the user interface and the PRCR node, were
presented in detail. Third, we presented the working process of PRCR by following the life
cycle of a data file managed by PRCR in the Cloud. Fourth, we presented two algorithms for
optimizing the performance of PRCR, which are the minimum replication algorithm and the
metadata distribution algorithm. Finally, we presented the evaluation of PRCR, in which the
performance and cost-effectiveness of PRCR were evaluated against the widely used
conventional three-replica data storage strategy.
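The minimum replication idea mentioned above can be illustrated with a small sketch. Assuming, purely for illustration, an exponential disk failure model with a constant annual failure rate and independent replica failures (the function names and failure model here are assumptions, not the book's actual algorithm), the minimum replication is the smallest replica count whose combined survival probability meets the reliability requirement:

```python
import math

def replica_survival(failure_rate: float, years: float) -> float:
    """Probability that a single replica survives the storage period,
    under an assumed exponential failure model with a constant
    annual failure rate."""
    return math.exp(-failure_rate * years)

def minimum_replicas(failure_rate: float, years: float,
                     reliability_requirement: float) -> int:
    """Smallest n such that 1 - (1 - p)^n >= requirement, where p is
    the single-replica survival probability (replicas assumed to
    fail independently)."""
    p = replica_survival(failure_rate, years)
    n = 1
    while 1 - (1 - p) ** n < reliability_requirement:
        n += 1
    return n

# Example: 2% annual failure rate, 1-year storage, 99.999% requirement
print(minimum_replicas(0.02, 1.0, 0.99999))  # -> 3
```

Under these illustrative numbers, two replicas fall short of the 99.999% requirement while three suffice, which is the kind of case-by-case decision that lets a minimum-replication scheme beat a fixed three-replica policy on storage cost.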
In Chapter 7, we presented our novel energy-efficient data transfer strategy, Link Rate
Controlled Data Transfer (LRCDT), for reducing the data transfer cost incurred during data
creation or data recovery processes. First, we presented the formulas for calculating data
transfer deadlines for data creation and data recovery processes, respectively. Second, we
presented the Cloud network model for the Cloud with bandwidth reservation, in which
four submodels were presented. Third, we presented the energy consumption model of net-
work devices in the Cloud. Fourth, we presented the LRCDT strategy in detail. Finally, we
presented the evaluation of LRCDT, in which the energy consumption and task completion
time of the strategy were evaluated against the existing minimum-speed and
maximum-speed data transfer strategies.
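The core idea behind LRCDT, transmitting at the lowest link rate that still meets the transfer deadline instead of at maximum speed, can be sketched as follows. The set of available link rates and the assumption that device energy grows with link rate are illustrative; this is not the book's actual network or energy model.

```python
def pick_link_rate(data_size_gb: float, deadline_s: float,
                   available_rates_gbps: list[float]) -> float:
    """Return the lowest available link rate (Gbit/s) that still
    completes the transfer before the deadline.
    Assumes an idealized link with no protocol overhead."""
    required = data_size_gb * 8 / deadline_s  # minimum rate in Gbit/s
    feasible = [r for r in sorted(available_rates_gbps) if r >= required]
    if not feasible:
        raise ValueError("deadline cannot be met at any available rate")
    return feasible[0]

# Example: 100 GB within 1 hour over links supporting 1/2.5/10/40 Gbit/s
print(pick_link_rate(100, 3600, [1, 2.5, 10, 40]))  # -> 1
```

Because network-device energy consumption generally increases with link rate, picking the slowest deadline-feasible rate trades unused slack time for lower energy, which is the trade-off LRCDT exploits.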
With all of the preceding content, our cost-effective replication-based Cloud
storage solution for the reliability assurance of big data is complete. Each
part of the solution, including the data reliability model, the algorithms, and
the cost-effective data reliability assurance approaches for the data creation,
data maintenance, and data recovery stages of the Cloud data life cycle, has
been presented in detail.
8.2 Key contributions of this topic
In this topic, our research focused on the issue of providing cost-effective storage
while meeting the reliability requirements of big data in the Cloud. Based on
systematic investigation of existing distributed storage technologies and of Cloud
storage and network environments, we provided a systematic cost-effective Cloud
data storage solution in which the data reliability requirement of each data file is
considered throughout the whole data life cycle. Given the rapid development of
data-intensive applications in the Cloud and the dramatic growth of Cloud data,
the significance of this research is clear. In particular, the major contributions of
this topic can be summarized in the following four parts:
First, a novel generic data reliability model for Cloud data storage is proposed for describing
the reliability of Cloud data with multiple replicas stored on devices with variable failure
patterns. As far as we know, this model is one of the few that investigate data replication
techniques under a variable disk failure rate.
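To make the idea concrete, the survival probability of a file stored with $n$ replicas can be written in the following general form; the notation here is illustrative and may differ from the book's own.

```latex
% Survival probability of replica i over [0, t], with a
% time-varying disk failure rate \lambda_i(\tau):
R_i(t) = \exp\!\left(-\int_0^t \lambda_i(\tau)\,\mathrm{d}\tau\right)

% The file is lost only if all n replicas fail, so assuming
% independent replica failures the file reliability is:
R_{\mathrm{file}}(t) = 1 - \prod_{i=1}^{n} \bigl(1 - R_i(t)\bigr)
```

A constant failure rate $\lambda_i(\tau) = \lambda$ recovers the familiar exponential model $R_i(t) = e^{-\lambda t}$; the integral form is what allows the model to capture variable failure patterns such as disk aging.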
Second, a new minimum replication calculation approach is proposed for calculating the
minimum number of replicas needed to meet the data reliability requirement. The
minimum replication can also act as a benchmark for evaluating the cost-effectiveness
of various replication-based data storage approaches. This approach is able to effectively