To simplify the computation of equation (5.7), our solution contains two
major steps:
First, based on the discrete disk failure rate pattern applied in our generic data reliability
model, the average disk failure rate is converted into a piecewise function λ(t)
of storage duration. Given the failure rate pattern of the disk and the start time of
the storage period, the average disk failure rate follows a piecewise
function containing n subfunctions, where n is the number of different disk failure rates
contained in the disk failure rate pattern after the start time. As a result, equation (5.7)
is transformed into an equation in which t is the only independent variable, with the variable λ
eliminated.
Second, after this first conversion, equation (5.7) becomes a piecewise function,
that is, several functions that each cover a specific period of storage duration.
Because more equations must now be solved to obtain the longest storage duration,
the solving process is still time-consuming and expensive in terms of overhead.
To optimize the performance of the solving process, the data reliability equation
is further simplified to reduce the computational
complexity. It is observed that the curve of data reliability with a single replica (i.e.,
$e^{-\lambda t}$) changes almost linearly when λt is in a certain range. Therefore, in this
value range, the curve can be substituted by a straight line in λt without
sacrificing much accuracy of the result. Assuming that the function of the substituted straight
line is $f(\lambda t) = a\lambda t + b$, equation (5.7) can be simplified into equation (5.8):

$$RA_k(1)^{t} = 1 - (1 - a\lambda_1 t - b)(1 - a\lambda_2 t - b) \qquad (5.8)$$
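As a rough numerical check of the linear substitution, the sketch below fits a line a·x + b to the single-replica reliability curve e^(-x), where x stands for λt, and measures the largest deviation over a range. The range [0, 0.1], the choice of a chord through the two end points, and the function names are assumptions made for illustration; the text does not state how a and b are chosen or which range of λt is used.

```python
import math


def chord_fit(x_lo, x_hi):
    """Return (a, b) of the line a*x + b that meets exp(-x) at both range ends."""
    y_lo, y_hi = math.exp(-x_lo), math.exp(-x_hi)
    a = (y_hi - y_lo) / (x_hi - x_lo)
    b = y_lo - a * x_lo
    return a, b


def max_abs_error(a, b, x_lo, x_hi, samples=1001):
    """Largest |exp(-x) - (a*x + b)| over evenly spaced sample points."""
    step = (x_hi - x_lo) / (samples - 1)
    worst = 0.0
    for i in range(samples):
        x = x_lo + i * step
        worst = max(worst, abs(math.exp(-x) - (a * x + b)))
    return worst


# Hypothetical range of lambda*t (not taken from the text).
X_LO, X_HI = 0.0, 0.1
a, b = chord_fit(X_LO, X_HI)
err = max_abs_error(a, b, X_LO, X_HI)
print(f"a = {a:.6f}, b = {b:.6f}, max abs error = {err:.6f}")
```

A wider range makes the deviation grow, which is why the substitution is only applied while λt stays inside a limited interval.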
As the average disk failure rate can be expressed as a first-degree piecewise function
of t, each λt term is quadratic in t, and equation (5.8), which multiplies two such
terms, is essentially a quartic function of t. The original nonpolynomial
equation (5.7) requires complicated equation-solving methods, such as trust-region
equation-solving algorithms [86]. In contrast, the simplified equation (5.8) can be
solved by the methods for solving polynomial equations, which are much
more efficient, and hence the calculation overhead can be significantly reduced.
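To illustrate why the polynomial form is cheap to handle, the sketch below builds the quartic for one piece of the piecewise function and locates the storage duration at which the linearized two-replica reliability meets a target. Several things here are assumptions and not part of the text: the average failure rate of replica i within the piece is written as λ_i(t) = c_i·t + d_i, the left-hand side of equation (5.8) is treated as a fixed number `target`, all numeric values are invented, and plain bisection stands in for a dedicated polynomial solver.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_eval(p, t):
    """Evaluate a polynomial (lowest degree first) with Horner's rule."""
    acc = 0.0
    for coeff in reversed(p):
        acc = acc * t + coeff
    return acc


def loss_factor(a, b, c, d):
    """Coefficients of 1 - a*(c*t + d)*t - b, i.e. (1 - b) - a*d*t - a*c*t^2."""
    return [1.0 - b, -a * d, -a * c]


def quartic_for_piece(a, b, rate1, rate2, target):
    """Coefficients of (1 - a*l1*t - b)(1 - a*l2*t - b) - (1 - target)."""
    p = poly_mul(loss_factor(a, b, *rate1), loss_factor(a, b, *rate2))
    p[0] -= 1.0 - target
    return p


def bisect_root(p, lo, hi, tol=1e-10):
    """Find a root of p in [lo, hi] by bisection; None if there is no sign change."""
    f_lo, f_hi = poly_eval(p, lo), poly_eval(p, hi)
    if f_lo * f_hi > 0:
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = poly_eval(p, mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2.0


# Hypothetical inputs: line coefficients, per-replica rate lines (c, d), target.
a, b = -0.95, 1.0
rate1, rate2 = (0.01, 0.05), (0.02, 0.04)
target = 0.9999

coeffs = quartic_for_piece(a, b, rate1, rate2, target)
t_max = bisect_root(coeffs, 0.0, 1.0)  # piece assumed to cover t in [0, 1]
print("quartic coefficients:", coeffs)
print("storage duration meeting the target:", t_max)
```

In a full calculation this would be repeated for each piece of the piecewise function, keeping only roots that fall inside the period the piece covers.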
In addition to the simplification described earlier, a further optimization addresses
the need to solve the equation multiple times. To avoid the excessive overhead of
solving equation (5.8) repeatedly, all of the calculations are conducted in one go
when the data file is first created in the
Cloud. As long as no replica of the data file is lost, the solving process does not
need to be conducted again, which improves efficiency.
In Chapter 6, the minimum replication calculation approach is applied in our generic
data reliability assurance mechanism, where we present the pseudocode of the
approach together with the mechanism.
5.2 Minimum replication benchmark
By solving the corresponding inequalities and equations mentioned in Section 5.1, the
minimum replication, that is, the minimum number of replicas required for meeting
the data reliability requirement, is determined. In addition to finding the minimum number
 