1 Introduction
With the rapid growth in the size of Cloud data, cost-effective data storage has become
one of the key issues in Cloud research, yet the reliability of the huge amounts of
Cloud data must still be fully assured. In this book, we investigate the trade-off between
cost-effective data storage and data reliability assurance in the Cloud. The research
takes the Cloud storage service providers' perspective and investigates how to provide
a cost-effective data storage service while meeting the data reliability requirement
throughout the whole Cloud data life cycle. This topic is important and has practical
value for Cloud computing technology. In particular, for data-intensive applications,
our research could dramatically reduce storage cost while meeting the data reliability
requirement, and hence has a positive impact on promoting the deployment of the Cloud.
This chapter introduces the background knowledge and key issues of this research.
It is organized as follows. Section 1.1 gives the definition of data reliability and briefly
introduces current data reliability assurance technologies in the Cloud. Section 1.2
introduces the background knowledge related to Cloud storage. Section 1.3 outlines
the key issues of the research. Finally, Section 1.4 presents an overview of the book
structure.
1.1 Data reliability in the Cloud
The term “reliability” is widely used as an aspect of the service quality provided by
hardware, systems, Web services, etc. In Standard TL9000, it is defined as “the ability
of an item to perform a required function under stated conditions for a stated time
period” [1]. Data reliability specifically refers to the reliability that data storage
services and systems provide for the stored data, and it can be defined as “the
probability of the data surviving in the system for a given period of time” [2]. While
the term “data reliability” is sometimes used in the industry as a superset of data
availability and various other topics, in this book we will stick to the definition of
data reliability given above.
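To make the definition concrete, the following is a minimal sketch of "the probability of the data surviving in the system for a given period of time". It assumes, purely for illustration, that data loss occurs at a constant failure rate, so the time to loss is exponentially distributed. The function name, the parameter names, and the example rate are hypothetical and are not taken from the cited definitions.

```python
import math


def data_reliability(failure_rate_per_year: float, years: float) -> float:
    """Probability that a stored data item survives for `years` years.

    Assumption (illustrative only): a constant failure rate, so the
    survival probability is exp(-rate * time).
    """
    if failure_rate_per_year < 0 or years < 0:
        raise ValueError("failure rate and period must be non-negative")
    return math.exp(-failure_rate_per_year * years)


if __name__ == "__main__":
    # Hypothetical failure rate of 0.01 per year, for demonstration only.
    for period in (1, 5, 10):
        print(period, "years:", data_reliability(0.01, period))
```

Under this assumed model, the reliability depends on both the failure rate and the length of the storage period, which is why the definition states a period of time explicitly.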
Data reliability indicates the ability of the storage system to keep data consistent;
hence, it is always one of the key metrics of a data storage/management system. In
large-scale distributed systems, because of the large number of storage devices in use,
failures of storage devices occur frequently [3]. The importance of data reliability is
therefore prominent, and these systems need better design and management to cope
with frequent failures. Increasing the data redundancy level could be a good way to
increase data reliability [4,5]. Among several major approaches for increasing the
data redundancy level, data replication is currently the most popular approach in
distributed storage systems. At present, data replication has been widely adopted in many
 
 