For all data files managed by PRCR, there are in total six types of metadata
attributes: file ID, time stamp, data reliability requirement, expected storage
duration, checking interval, and replica addresses.
File ID: The unique identification of the data file.
Time stamp: Records the time when the last proactive replica checking task for the data file
was conducted.
Data reliability requirement and expected storage duration: These are the requirements for
the storage qualities.
Checking interval: The maximum time interval allowed between two consecutive proactive
replica checking tasks for the same data file.
Replica addresses: Record the location of each replica.
These metadata attributes are obtained as follows. The file ID and replica addresses are
given automatically when the original and second replicas of the data file are created. The
time stamp is initialized with the current time and then updated whenever a proactive replica
checking task is conducted. The data reliability requirement and expected storage duration
can be given by the storage user and are maintained for rebuilding metadata in case of replica
loss. The checking interval can be calculated by using the minimum replication calculation
approach.
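
The six attributes can be pictured as a single record per data file. The following is a minimal sketch only: the class name, field names, units (seconds), and default values are assumptions made for illustration, and the checking interval is taken as an input because its calculation is not covered here.

```python
from dataclasses import dataclass, field
from typing import List
import time


@dataclass
class FileMetadata:
    """Illustrative metadata record for one data file (names are hypothetical)."""
    file_id: str                    # given automatically at replica creation
    replica_addresses: List[str]    # location of each replica
    checking_interval: float        # seconds; computed elsewhere (assumed input)
    # User-provided requirements; the defaults below are assumed placeholders.
    reliability_requirement: float = 0.999
    expected_storage_duration: float = 365 * 24 * 3600.0
    # Initialized with the current time, updated after every check.
    time_stamp: float = field(default_factory=time.time)

    def record_check(self, now: float) -> None:
        """Update the time stamp after a proactive replica checking task."""
        self.time_stamp = now
```

Only the reliability requirement and the expected storage duration would come from the storage user in this sketch; the remaining fields are filled in by the system.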
Among these attributes, the data reliability requirement and expected storage duration
are the only ones provided by the storage user (default values may apply if they
are not given). All the other attributes relate to the storage structure and are
transparent to the storage user. The checking interval equals the longest storage duration
of the data file while meeting the data reliability requirement. Therefore, starting from
the time that the last proactive replica checking task is conducted, within the checking
interval period, PRCR must check the data file at least once so as to ensure that the data
reliability assurance is higher than the data reliability requirement. As mentioned in
Section 5.1, because of the variable disk failure rate, the longest storage duration of a Cloud
data file varies. Therefore, one or more checking interval values may apply throughout
the life span of the data file in the Cloud. Based on the time stamp and the checking
interval, PRCR is able to determine the time at which each data file needs to be
checked. Using the replica addresses, all replicas of the data file can be located.
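
The way the time stamp and the checking interval together fix the latest time for the next check can be sketched as follows. The function names are hypothetical and times are assumed to be plain numbers in a common unit such as seconds.

```python
def next_check_deadline(time_stamp: float, checking_interval: float) -> float:
    """Latest time by which the data file must be checked again."""
    return time_stamp + checking_interval


def needs_check(time_stamp: float, checking_interval: float, now: float) -> bool:
    """True once the checking interval since the last check has elapsed."""
    return now >= next_check_deadline(time_stamp, checking_interval)
```

Because the checking interval may change over the life span of a data file, a caller would pass whichever interval value currently applies. A real scheduler would presumably trigger the check somewhat before the deadline rather than exactly at it.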
6.2.2 PRCR node
The PRCR node is the core component of PRCR, responsible for managing the
metadata and replicas of data files. Each PRCR node has two parts: the data
table and the replica management module, which maintain the metadata of data
files and conduct the proactive replica checking tasks, respectively.
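
As a rough sketch of this two-part structure, the node below holds a data table (a dictionary keyed by file ID) and a replica management routine that walks the table and checks the files whose interval has elapsed. The class, method, and key names are hypothetical, and the actual replica check is passed in as a callback because its details are not specified here.

```python
from typing import Callable, Dict, List


class PRCRNode:
    """Illustrative node: a data table plus a replica management routine."""

    def __init__(self, check_replicas: Callable[[str, List[str]], None]):
        self.data_table: Dict[str, dict] = {}   # file ID -> metadata
        self._check_replicas = check_replicas   # assumed checking callback

    def add_file(self, file_id: str, replica_addresses: List[str],
                 checking_interval: float, now: float) -> None:
        """Register a data file; the time stamp starts at the current time."""
        self.data_table[file_id] = {
            "replica_addresses": replica_addresses,
            "checking_interval": checking_interval,
            "time_stamp": now,
        }

    def scan(self, now: float) -> List[str]:
        """Check every file whose interval has elapsed; return their IDs."""
        checked = []
        for file_id, meta in self.data_table.items():
            if now - meta["time_stamp"] >= meta["checking_interval"]:
                self._check_replicas(file_id, meta["replica_addresses"])
                meta["time_stamp"] = now         # record the completed check
                checked.append(file_id)
        return checked
```

Keeping the check itself behind a callback separates the metadata bookkeeping from whatever mechanism is actually used to verify and repair replicas.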
Data table (see footnote 2): For all data files that each PRCR node manages, the metadata
attributes are maintained in the data table. To ensure the data reliability of data files, all
the metadata are periodically scanned by the replica management module. The so-called
scan inspects the metadata of a data file in the data table and determines whether
proactive replica checking is necessary. In the data table, each round of the scan is
Footnote 2: The reliability of the data table itself is beyond the scope of our topic. In fact,
a conventional primary-secondary backup mechanism may well serve the purpose.
 