which is exactly the same as equation (14.11). We state the result in the following theorem:
Theorem 14.2. The mixed distributed-baseline rebuild scheme achieves the optimal rebuild
rate and hence requires the minimum amount of rebuild time.
Proof. The result follows directly from equations (14.11) and (14.15).
14.6.3 Controlling the Rebuild Time
The previous sections focus on deriving the rebuild rate and time. It is clear that the rebuild
time increases with server utilization $\rho$. To control the rebuild time, the server could limit
its utilization to reserve transfer capacity for the rebuild process.
First, by setting $\rho = 0$ in equation (14.15) we can obtain the minimum achievable rebuild
time:
$$T_{\min} = \frac{U}{R_{\max}\big|_{\rho=0}} = \frac{2(N_S-1)\,U}{N_S S_S} \tag{14.16}$$
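Second, if we want to complete the rebuild process by time $t$ ($t \ge T_{\min}$), we will need to
limit the server utilization $\rho$ to
$$\rho \le \begin{cases}
1 + \dfrac{1}{N_S-1} - \dfrac{2U}{t S_S}, & \text{for } T_{\min} \le t \le \dfrac{U(N_S-1)}{S_S} \\[2ex]
1 - \dfrac{U}{t S_S}, & \text{for } t > \dfrac{U(N_S-1)}{S_S}
\end{cases} \tag{14.17}$$
by means of admission control.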
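As a rough illustration of how an admission controller might apply equations (14.16) and (14.17), the following Python sketch computes the minimum rebuild time and the utilization limit for a target completion time. The function and parameter names are hypothetical and not taken from the text; the sketch assumes storage is expressed in megabits, transfer capacity in Mbps, and time in seconds.

```python
def min_rebuild_time(n_servers: int, unit_storage: float, capacity: float) -> float:
    """Minimum rebuild time T_min, equation (14.16): 2(N_S - 1)U / (N_S * S_S)."""
    return 2 * (n_servers - 1) * unit_storage / (n_servers * capacity)


def max_utilization(t: float, n_servers: int, unit_storage: float, capacity: float) -> float:
    """Largest server utilization that still completes the rebuild by time t (14.17)."""
    if t < min_rebuild_time(n_servers, unit_storage, capacity):
        raise ValueError("target time is shorter than the minimum rebuild time T_min")
    # Boundary between the two cases of equation (14.17): U(N_S - 1) / S_S.
    crossover = unit_storage * (n_servers - 1) / capacity
    if t <= crossover:
        return 1 + 1 / (n_servers - 1) - 2 * unit_storage / (t * capacity)
    return 1 - unit_storage / (t * capacity)


if __name__ == "__main__":
    # Illustrative values only: 5 servers, 1.6e6 Mb per server, 600 Mbps capacity.
    n, u, s = 5, 1.6e6, 600.0
    t_min = min_rebuild_time(n, u, s)
    for target in (t_min, 2 * t_min, 4 * t_min):
        limit = max_utilization(target, n, u, s)
        print(f"t = {target:9.1f} s, utilization limit = {limit:.3f}")
```

At $t = T_{\min}$ the limit evaluates to zero, as expected: the rebuild can only finish that quickly if the servers carry no other load. Both cases of (14.17) give the same limit at the boundary, so the limit is continuous in $t$.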
14.7 Numerical Results
To illustrate and compare the performance of the rebuild algorithms, we consider a system of
$N_S = 5$ active servers and one spare server. Each server has 200 GB of storage, so the system
has a total of 1 TB of storage, including the redundant units. We assume a server transfer
capacity of 600 Mbps, e.g., using Gigabit Ethernet links.
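To give a feel for the magnitudes involved, the configuration above can be substituted into equation (14.16) and into the time boundary between the two cases of equation (14.17). This is a worked sketch that assumes decimal units (1 GB = 8000 Mb); the resulting figures are not quoted from the text.

```latex
\begin{align*}
U &= 200~\text{GB} = 1.6 \times 10^{6}~\text{Mb}, \qquad N_S = 5, \qquad S_S = 600~\text{Mbps}, \\
T_{\min} &= \frac{2(N_S-1)\,U}{N_S S_S}
          = \frac{2 \cdot 4 \cdot 1.6 \times 10^{6}}{5 \cdot 600}~\text{s}
          \approx 4267~\text{s} \approx 71~\text{min}, \\
\frac{U(N_S-1)}{S_S} &= \frac{1.6 \times 10^{6} \cdot 4}{600}~\text{s}
          \approx 10667~\text{s} \approx 178~\text{min}.
\end{align*}
```

Under these assumptions, a fully idle system could finish the rebuild in a little over an hour, and any target completion time beyond roughly three hours falls under the second case of (14.17).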
Figure 14.7 plots the data rebuild rate versus server utilization for all the rebuild algorithms.
We include the data rebuild rate for reloading data from back-up for the sake of comparison.
Note that this data rebuild rate is also the upper bound. We observe that for baseline
rebuild, the data rebuild rate is constant at $S_S/(N_S-1)$ for $\rho \le (1 - 1/(N_S-1))$, even if
the remaining active servers are lightly loaded and have idle capacity available. As the system
utilization approaches one, the rebuild rate drops quickly. Distributed rebuild performs better
than baseline rebuild when the server utilization is low (e.g., $\rho < 0.55$), but it deteriorates
earlier when the system utilization increases. This is because in distributed rebuild the active
servers need to receive data transmissions from other servers in addition to sending data to
other servers, and thus consume considerably more transfer capacity than baseline rebuild.
Finally, as expected, the mixed distributed-baseline rebuild gives the best performance in all
cases.
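The comparison between baseline rebuild and the optimal rate can be sketched numerically. The Python fragment below is an illustration under stated assumptions, not the book's own formulation: the flat baseline rate $S_S/(N_S-1)$ comes from the text, but the behaviour beyond the threshold (rate limited by the idle capacity $(1-\rho)S_S$) is an assumption, and the optimal rate is obtained here by solving equation (14.17) for $U/t$ rather than quoted from equation (14.15). All function names are hypothetical.

```python
def baseline_rate(rho: float, n_servers: int, capacity: float) -> float:
    """Baseline rebuild rate: flat at S_S / (N_S - 1) while utilization is low."""
    flat = capacity / (n_servers - 1)
    # ASSUMPTION: past the threshold, the idle capacity (1 - rho) * S_S is the limit.
    return min(flat, (1 - rho) * capacity)


def optimal_rate(rho: float, n_servers: int, capacity: float) -> float:
    """Rate implied by equation (14.17) when its bound is solved for U / t."""
    threshold = 1 - 1 / (n_servers - 1)
    if rho <= threshold:
        return capacity * (1 + 1 / (n_servers - 1) - rho) / 2
    return (1 - rho) * capacity


if __name__ == "__main__":
    n, s = 5, 600.0  # N_S = 5 servers, S_S = 600 Mbps
    for i in range(0, 11):
        rho = i / 10
        print(f"rho = {rho:.1f}  baseline = {baseline_rate(rho, n, s):6.1f} Mbps"
              f"  optimal = {optimal_rate(rho, n, s):6.1f} Mbps")
```

With $N_S = 5$ and $S_S = 600$ Mbps, the baseline rate stays at 150 Mbps up to $\rho = 0.75$, while the optimal rate starts at 375 Mbps for an idle system and meets the baseline rate at the threshold.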