of replicas for data storage in our data reliability assurance solution, the minimum
replication could also be used as a benchmark for evaluating different approaches.
It shows the theoretical minimum data redundancy level that a replication-based data
storage system can reach without jeopardizing the data reliability requirement. Using
this benchmark, both the cost-effectiveness of a replication-based data storage system
and its ability to provide data reliability assurance can be clearly presented, as
described next.
Consider a data file set F(f_1, f_2, f_3, ..., f_m) managed by a replication-based system
S(d_1, d_2, d_3, ..., d_n) with the data reliability requirement set RR(r_1, r_2, r_3, ..., r_m),
where f_i(r_{i1}, r_{i2}, r_{i3}, ..., r_{ip}) denotes a data file in F and d_q denotes a disk in S.
r_{ij}(d_q) denotes the j-th replica of f_i, which is stored on disk d_q. To avoid searching
for the disks that store all the replicas of each data file, the disk failure rate patterns
are obtained from randomly selected disks. The minimum replication approach is then
applied to each f_i in F, which yields the minimum replication min_i for each f_i. The
minimum replication level for storing data file set F is given by equation (5.9):
MIN_S = \frac{\sum_{i=1}^{m} min_i}{m}    (5.9)
When the current replication level in system S is close to MIN_S, it means that the
data stored in the system are maintained cost-effectively. However, when the current
replication level is lower than MIN_S, it means that the data redundancy level of the
system is too low to provide sufficient data reliability assurance, so that the data
reliability requirement could be jeopardized.
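The benchmark comparison described above can be illustrated with a minimal Python sketch. It assumes that the per-file minimum replication values min_i have already been calculated, and that the "current replication level" of S is measured as the average number of replicas per file. The function names, the sample values, and the 10% tolerance used to decide what counts as "close to" MIN_S are all hypothetical choices made for illustration; the text does not quantify "close".

```python
from statistics import mean


def min_replication_benchmark(min_levels):
    """Return MIN_S: the mean of the per-file minimum replication values min_i."""
    if not min_levels:
        raise ValueError("at least one data file is required")
    return mean(min_levels)


def assess(current_level, min_s, tolerance=0.10):
    """Compare the current replication level with the benchmark MIN_S.

    `tolerance` is an assumed threshold for "close to MIN_S"; it is not
    taken from the text.
    """
    if current_level < min_s:
        return "insufficient"      # redundancy too low for the requirement
    if current_level <= min_s * (1 + tolerance):
        return "cost-effective"    # close to the theoretical minimum
    return "over-replicated"       # more redundancy than the benchmark needs


# Hypothetical per-file minimum replication values min_1 .. min_4.
min_levels = [1, 2, 2, 1]
min_s = min_replication_benchmark(min_levels)

# Hypothetical system that keeps three replicas of every file.
current = mean([3, 3, 3, 3])
verdict = assess(current, min_s)
```

Because MIN_S is an average over the file set, this comparison characterizes the system as a whole; whether an individual file meets its own requirement still has to be checked against its own min_i.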
5.3 Evaluation of the minimum replication calculation approach
In this section, we briefly present the results of our evaluation of the minimum
replication calculation approach, so as to provide an intuitive understanding of its
effectiveness. The evaluation is conducted by running a minimum replication algorithm,
which is essentially the implementation of the minimum replication approach and runs
as part of our data reliability assurance mechanism, to be presented in Chapter 6.
Because the minimum replication algorithm is described in Chapter 6, details of the
experiments are presented there as well.
We evaluate the algorithm under different data reliability requirements and with
different configurations, including failure rate types and calculation equations. The
evaluation considers two aspects: the execution time of the algorithm, and the accuracy
rate of the output of the optimized algorithm compared with that of the original
algorithm (see Section 6.5 for more details). The execution time reflects the computing
overhead of the minimum replication calculation approach, while the accuracy rate
reflects the effectiveness of our optimization of the minimum replication calculation
approach presented in Section 5.1.
 