Although the aggregate transfer rate from the remaining servers is greater than $S_S$, the spare
server can only accept data at a rate of $S_S$. Using erasure correction, the spare server will rebuild
one data unit for every $(N_S - 1)$ data/redundant units received. We have therefore a rebuild
rate $R_{\text{baseline}} = S_S/(N_S - 1)$ at the spare server.
Case 2: $\rho \ge 1 - 1/(N_S - 1)$
From equation (14.2):
$$r = S_S(1 - \rho)(N_S - 1) \le S_S\left(1 - \left(1 - \frac{1}{N_S - 1}\right)\right)(N_S - 1) = S_S$$
As the transfer rate from the remaining servers is less than $S_S$, the corresponding rebuild rate
at the spare server is $R_{\text{baseline}} = r/(N_S - 1) = S_S(1 - \rho)$.
The rebuild time for a server with storage $U$ is then given by
$$T_{\text{baseline}} = \frac{(N_S - 1)\,U}{\min\{S_S,\; S_S(1 - \rho)(N_S - 1)\}} \tag{14.4}$$
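The two cases and equation (14.4) can be checked numerically with a small sketch like the one below. The function and parameter names are illustrative and do not come from the text, the value ranges accepted for `rho` and `n_s` are assumptions, and the units of `s_s` and `u` are whatever the caller chooses, provided they are consistent.

```python
def baseline_rebuild_rate(s_s: float, rho: float, n_s: int) -> float:
    """Rebuild rate at the spare server under baseline rebuild.

    s_s, rho and n_s stand for S_S, rho and N_S in the text
    (illustrative names, not from the source).
    """
    if n_s < 2 or not 0.0 <= rho < 1.0:
        raise ValueError("assumed ranges: n_s >= 2 and 0 <= rho < 1")
    # Aggregate transfer rate from the remaining servers, equation (14.2).
    r = s_s * (1.0 - rho) * (n_s - 1)
    # The spare server accepts at most s_s, and it needs (n_s - 1)
    # received units to rebuild one data unit.
    return min(s_s, r) / (n_s - 1)


def baseline_rebuild_time(u: float, s_s: float, rho: float, n_s: int) -> float:
    """Time to rebuild a server with storage u, equation (14.4)."""
    return u / baseline_rebuild_rate(s_s, rho, n_s)
```

For example, with the hypothetical values `s_s = 100.0`, `rho = 0.5` and `n_s = 5`, the remaining servers can supply more than the spare server accepts, so the first case applies; raising `rho` to `0.9` moves the system into the second case.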
14.5 Distributed Rebuild
In baseline rebuild, the transfer capacity of the spare server can become the bottleneck even
if the remaining servers have abundant idle capacities available. An alternative approach is to
rebuild the unavailable data units before transferring them to the spare server. In this way, only
the rebuilt data are sent to the spare server and hence the limited transfer capacity of the spare
server can be better utilized.
To achieve this, we can employ a distributed rebuild scheme to distribute the rebuild computations
over all the remaining servers. We first divide the unavailable data into $(N_S - 1)$ equal-size
subsets, with each subset then rebuilt by one of the remaining $(N_S - 1)$ servers, as shown
in Figure 14.6. The server responsible for a subset will receive the required data/redundant
units from the other $(N_S - 2)$ servers, rebuild the unavailable units, and then send the rebuilt
units to the spare server for storage.
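The data flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses single XOR parity as the erasure code (the text does not name a specific code), assigns parity groups to rebuild servers round-robin (which yields equal-size subsets only when the number of groups is a multiple of N_S - 1), and models servers as list positions. All names are hypothetical.

```python
from functools import reduce


def xor_bytes(blocks):
    """XOR a list of equal-length byte strings together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def distributed_rebuild(stripes, failed, n_servers):
    """Rebuild the units of a failed server, one parity group at a time.

    stripes[g][s] is the unit that server s stores for parity group g.
    Returns (units delivered to the spare server, inter-server transfers).
    """
    survivors = [s for s in range(n_servers) if s != failed]
    spare = {}
    transfers = 0
    for g, stripe in enumerate(stripes):
        # Round-robin assignment of groups to rebuild servers (assumption).
        rebuilder = survivors[g % len(survivors)]
        local = stripe[rebuilder]  # already on the rebuild server
        # Units fetched from the other (n_servers - 2) surviving servers.
        received = [stripe[s] for s in survivors if s != rebuilder]
        transfers += len(received)
        # With XOR parity, the missing unit is the XOR of all the others.
        spare[g] = xor_bytes([local] + received)
    return spare, transfers
```

Only the rebuilt unit of each group reaches the spare server; the fetches counted in `transfers` stay among the remaining servers.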
To derive the rebuild rate of the distributed rebuild algorithm, we first note that the sum
of transfer rates in and out of the remaining $(N_S - 1)$ servers is equal to $S_S(1 - \rho)(N_S - 1)$.
Second, to rebuild each data unit we need $2(N_S - 2)$ transfers (half for transmission and the
other half for reception) of data/redundant units from the other $(N_S - 2)$ servers to the rebuild
server, that is, the server responsible for rebuilding the unavailable data unit. Note that we need only
$(N_S - 2)$ transmissions instead of $(N_S - 1)$ because the rebuild server already has one of the
data/redundant units stored locally, and so no transfer over the network is needed for that unit.
One further transfer sends the rebuilt unit to the spare server, for a total of $2(N_S - 2) + 1$
transfers per rebuilt unit. Therefore, we can compute the rebuild rate $R_{\text{distributed}}$ from
$$R_{\text{distributed}} = \frac{S_S(1 - \rho)(N_S - 1)}{2(N_S - 2) + 1} = \frac{S_S(1 - \rho)(N_S - 1)}{2N_S - 3} \tag{14.5}$$
which is also the data rate at which rebuilt data are sent to the spare server.
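To see how equation (14.5) compares with the baseline rate derived earlier in this section, the following sketch evaluates both expressions. The function names and the sample values in the usage note are hypothetical, and the baseline function simply restates the min-based expression behind equation (14.4).

```python
def distributed_rebuild_rate(s_s: float, rho: float, n_s: int) -> float:
    """Rebuild rate of distributed rebuild, equation (14.5)."""
    # Total in/out transfer rate of the remaining servers, divided by the
    # 2(N_S - 2) + 1 = 2 N_S - 3 transfers needed per rebuilt unit.
    return s_s * (1.0 - rho) * (n_s - 1) / (2 * n_s - 3)


def baseline_rebuild_rate(s_s: float, rho: float, n_s: int) -> float:
    """Baseline rate: min{S_S, S_S (1 - rho)(N_S - 1)} / (N_S - 1)."""
    return min(s_s, s_s * (1.0 - rho) * (n_s - 1)) / (n_s - 1)


def rebuild_time(u: float, rate: float) -> float:
    """Time to rebuild storage u at a given rebuild rate."""
    return u / rate
```

With the hypothetical values `s_s = 100.0` and `n_s = 5`, the distributed rate is higher than the baseline rate at `rho = 0.5` but lower at `rho = 0.9`, which suggests that under these two formulas the comparison depends on the load.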