Conclusions and future work - Reliability Assurance of Big Data in the Cloud


determine the minimum number of replicas for the Cloud data storage with relatively small

computing overhead (i.e., execution time).

• Third, an innovative generic data reliability assurance mechanism named PRCR is proposed for maintaining big data in the Cloud in a cost-effective fashion while offering appropriate data reliability assurance. It can efficiently manage data reliability across a wide range of data reliability requirements. Compared with storage using the conventional three-replica strategy, PRCR can reduce the storage cost by between one-third and two-thirds, while the running overhead of PRCR itself is negligibly small.

• Fourth, an innovative energy-efficient data transfer strategy named LRCDT is proposed for reducing the cost of the data transfer activities that are intensively involved in data creation and recovery processes. The strategy balances the trade-off between data transfer speed and energy consumption, and hence benefits cost-effective storage for data reliability in both the data creation stage and the data recovery stage. LRCDT significantly reduces the data transfer energy consumption during data creation and data recovery: it saves up to 33.7% of the energy consumed under the minimum-speed strategy, or 63% of that consumed under the maximum-speed strategy. This energy saving is achieved by sacrificing some data transfer time, but without jeopardizing the deadline.
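The one-third to two-thirds storage saving quoted for PRCR above can be checked with simple arithmetic. The sketch below assumes that storage cost is proportional to the number of replicas kept and that PRCR keeps one or two replicas (the "no more than two replicas" fashion); the function name `storage_saving` is hypothetical and used only for illustration.

```python
from fractions import Fraction


def storage_saving(replicas: int, baseline_replicas: int = 3) -> Fraction:
    """Fraction of storage cost saved relative to a baseline replica count.

    Assumption: storage cost scales linearly with the number of replicas.
    """
    if replicas < 1 or baseline_replicas < 1:
        raise ValueError("replica counts must be at least 1")
    return 1 - Fraction(replicas, baseline_replicas)


# One or two replicas compared with the conventional three-replica strategy.
for r in (1, 2):
    print(f"{r} replica(s): saving = {storage_saving(r)}")
```

Under these assumptions, keeping one replica saves two-thirds of the three-replica cost and keeping two replicas saves one-third, which matches the range stated in the text.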

8.3 Further discussion and future work

In this section, we first present some further discussion of the PRCR mechanism, and then present future work on this research topic.

8.3.1 Further discussions

Cloud data storage concerns not only reliability but also other aspects such as availability and data access performance. These other aspects are not addressed in this book. With the “no more than two replicas” storage fashion of PRCR, there could potentially be side effects on data availability and data access performance. However, this does not mean that storing more than two replicas of the data is infeasible with PRCR. Based on particular needs, any number of replicas can be created, and PRCR can maintain all of them.

In addition, the generality of PRCR needs to be discussed. As mentioned in Section 3.2, our research is based on the Cloud with a replication-based data storage scheme. However, PRCR is generic rather than specific to replication-based data storage schemes. Combining PRCR with an erasure coding-based data storage scheme could be feasible for increasing the reliability of erasure-coded data. Similar to what it does for data replicas, PRCR could proactively check the erasure-coded data blocks periodically and recover lost data blocks when they are found. Erasure-coded data can be recovered when fewer than n data blocks are available, as long as fewer than k data blocks have been lost. By applying PRCR, the probability of losing k data blocks can be reduced, so that the reliability of the data is improved. For the combination of PRCR with erasure coding-based data storage, the data reliability model and data recovery process can be further investigated.
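To illustrate why periodic proactive checking could lower the probability of losing k data blocks, the following is a minimal sketch and not the book's reliability model. It assumes that blocks fail independently with exponentially distributed lifetimes, that every check restores all lost blocks, and, following the text above, that k denotes the number of lost blocks at which the data become unrecoverable. All function names and numeric values are hypothetical.

```python
import math


def block_loss_prob(failure_rate: float, interval: float) -> float:
    """Probability that one block is lost within a single check interval.

    Assumption: exponentially distributed block lifetime with the given rate.
    """
    return 1.0 - math.exp(-failure_rate * interval)


def data_loss_prob(n: int, k: int, p: float) -> float:
    """Probability that at least k of n blocks are lost within one interval.

    Assumption: block losses are independent, each with probability p.
    """
    return sum(
        math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1)
    )


# Hypothetical parameters: 9 blocks, data unrecoverable once 4 are lost,
# 0.05 failures per block per year.
n, k, rate = 9, 4, 0.05
for interval_years in (1.0, 0.5, 0.1):
    p = block_loss_prob(rate, interval_years)
    print(f"check every {interval_years} years: "
          f"P(loss in one interval) = {data_loss_prob(n, k, p):.3e}")
```

Under these assumptions, shortening the check interval reduces the chance that k blocks fail before the next check. A complete analysis would also need to account for the number of intervals over the storage duration and for recovery time, which is part of the further investigation mentioned above.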
