Algorithms for Server Rebuild - Scalable Continuous Media Streaming Systems

Although the aggregate transfer rate from the remaining servers is greater than $S_S$, the spare server can only accept data at a rate of $S_S$. Using erasure correction, the spare server will rebuild one data unit for every $(N_S - 1)$ data/redundant units received. We therefore have a rebuild rate $R_{baseline} = S_S/(N_S - 1)$ at the spare server.

Case 2: $\rho \ge (N_S - 2)/(N_S - 1)$

From equation (14.2):

$$S_S(1-\rho)(N_S-1) \le S_S\left(1 - \frac{N_S-2}{N_S-1}\right)(N_S-1) = S_S$$

As the transfer rate from the remaining servers is less than $S_S$, the corresponding rebuild rate at the spare server is

$$R_{baseline} = \frac{S_S(1-\rho)(N_S-1)}{N_S-1} = S_S(1-\rho)$$

The rebuild time for a server with storage $U$ is then given by

$$T_{baseline} = \frac{(N_S-1)\,U}{\min\{S_S,\; S_S(1-\rho)(N_S-1)\}} \qquad (14.4)$$
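As a rough numerical check of equation (14.4), the sketch below evaluates the baseline rebuild rate and rebuild time for both cases. The function names, the argument names, and the units (MB/s for $S_S$, MB for $U$) are illustrative assumptions and do not come from the text.

```python
def baseline_rebuild_rate(s_s: float, rho: float, n_s: int) -> float:
    """Rate at which the spare server rebuilds data units (same units as s_s).

    s_s : transfer capacity of one server (assumed unit: MB/s)
    rho : utilization of the remaining servers, 0 <= rho < 1
    n_s : total number of servers, including the failed one
    """
    if n_s < 2:
        raise ValueError("need at least two servers")
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must satisfy 0 <= rho < 1")
    # Idle transfer capacity summed over the (n_s - 1) remaining servers.
    aggregate = s_s * (1.0 - rho) * (n_s - 1)
    # The spare server cannot take in more than its own capacity s_s.
    inflow = min(s_s, aggregate)
    # One unit is rebuilt for every (n_s - 1) units received.
    return inflow / (n_s - 1)


def baseline_rebuild_time(u: float, s_s: float, rho: float, n_s: int) -> float:
    """Rebuild time for a server holding u units of storage, per equation (14.4)."""
    return u / baseline_rebuild_rate(s_s, rho, n_s)


if __name__ == "__main__":
    # Spare-server-limited case: the remaining servers have plenty of idle capacity.
    print(baseline_rebuild_time(1000.0, 10.0, 0.5, 5))
    # Source-limited case: the remaining servers are heavily loaded.
    print(baseline_rebuild_time(1000.0, 10.0, 0.9, 5))
```

Taking the minimum of the two inflow limits covers both cases in one expression, which mirrors the $\min\{\cdot\}$ in the denominator of (14.4).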

14.5 Distributed Rebuild

In baseline rebuild, the transfer capacity of the spare server can become the bottleneck even if the remaining servers have abundant idle capacity. An alternative approach is to rebuild the unavailable data units before transferring them to the spare server. In this way, only the rebuilt data are sent to the spare server, and hence the limited transfer capacity of the spare server is better utilized.

To achieve this, we can employ a distributed rebuild scheme to distribute the rebuild computations over all the remaining servers. We first divide the unavailable data into $(N_S - 1)$ equal-size subsets, with each subset then rebuilt by one of the remaining $(N_S - 1)$ servers, as shown in Figure 14.6. The server responsible for a subset will receive the required data/redundant units from the other $(N_S - 2)$ servers, rebuild the unavailable units, and then send the rebuilt units to the spare server for storage.
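The division of work described above can be sketched as follows. The text only requires equal-size subsets; the round-robin assignment, the function name, and the server labels used here are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def assign_rebuild_subsets(
    unavailable_units: Sequence[int],
    remaining_servers: Sequence[str],
) -> Dict[str, List[int]]:
    """Split the unavailable unit ids across the remaining servers.

    Round-robin is one possible way (an assumption, not taken from the text)
    to obtain subsets whose sizes differ by at most one unit.
    """
    if not remaining_servers:
        raise ValueError("at least one remaining server is required")
    assignment: Dict[str, List[int]] = {s: [] for s in remaining_servers}
    for i, unit in enumerate(unavailable_units):
        rebuild_server = remaining_servers[i % len(remaining_servers)]
        assignment[rebuild_server].append(unit)
    return assignment


if __name__ == "__main__":
    # Hypothetical system with N_S = 5: one failed server, four remaining.
    plan = assign_rebuild_subsets(range(8), ["s1", "s2", "s3", "s4"])
    for server, units in plan.items():
        print(server, units)
```

Each rebuild server would then fetch the matching data/redundant units from the other $(N_S - 2)$ servers, rebuild its subset, and forward only the rebuilt units to the spare server.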

To derive the rebuild rate of the distributed rebuild algorithm, we first note that the sum of transfer rates in and out of the remaining $(N_S - 1)$ servers is equal to $S_S(1-\rho)(N_S-1)$. Second, to rebuild each data unit we need $2(N_S - 2)$ transfers (half for transmission and the other half for reception) of data/redundant units from the other $(N_S - 2)$ servers to the rebuild server, that is, the server responsible for rebuilding the unavailable data unit. Note that we need only $(N_S - 2)$ transmissions instead of $(N_S - 1)$ because the rebuild server already has one of the data/redundant units stored locally, and so no transfer over the network is needed. One further transmission sends the rebuilt unit to the spare server. Therefore, we can compute the rebuild rate $R_{distributed}$ from

$$R_{distributed} = \frac{S_S(1-\rho)(N_S-1)}{2(N_S-2)+1} = \frac{S_S(1-\rho)(N_S-1)}{2N_S-3} \qquad (14.5)$$

which is also the data rate at which rebuilt data are sent to the spare server.
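A small sketch of equation (14.5) is given below. It assumes the denominator is $2(N_S - 2) + 1 = 2N_S - 3$, that is, the $2(N_S - 2)$ transfers among the remaining servers plus one transmission of the rebuilt unit to the spare server. The function and argument names are hypothetical.

```python
def transfers_per_rebuilt_unit(n_s: int) -> int:
    """Transfers charged to the remaining servers for one rebuilt unit.

    Assumed breakdown: (n_s - 2) transmissions and (n_s - 2) receptions among
    the remaining servers, plus one transmission to the spare server.
    """
    if n_s < 2:
        raise ValueError("need at least two servers")
    return 2 * (n_s - 2) + 1


def distributed_rebuild_rate(s_s: float, rho: float, n_s: int) -> float:
    """Rebuild rate of distributed rebuild under the assumptions stated above.

    s_s : transfer capacity of one server
    rho : utilization of the remaining servers, 0 <= rho <= 1
    n_s : total number of servers, including the failed one
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must satisfy 0 <= rho <= 1")
    # Total idle transfer capacity (in plus out) of the remaining servers.
    idle_capacity = s_s * (1.0 - rho) * (n_s - 1)
    return idle_capacity / transfers_per_rebuilt_unit(n_s)


if __name__ == "__main__":
    # Illustrative values: S_S = 10, rho = 0.5, N_S = 5.
    print(distributed_rebuild_rate(10.0, 0.5, 5))
```

Because $(N_S - 1)/(2N_S - 3) \le 1$ for $N_S \ge 2$, the rate computed this way never exceeds $S_S$, so the spare server's own transfer capacity does not need a separate cap in this sketch.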
