Redundant Data Update (SRDU) algorithm for updating redundant data in erasure-coded
distributed storage. This algorithm exploits the structure of Reed-Solomon erasure correction
codes to enable the reuse of old redundant data in generating the new redundant data. This
reuse allows the algorithm to cut the amount of data transferred by as much as 70%. It is
worth noting that the algorithms presented in this chapter are not limited to parallel server
systems, but are also applicable to any striped data storage, such as disk arrays, RAID [1],
or even emerging peer-to-peer systems. For this reason we will use the term
“node” to refer to a device in the striped storage, e.g., a server in a parallel streaming server, a
disk in a RAID, or a user host in a peer-to-peer streaming system.
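To make the reuse idea concrete before the details arrive, the following Python sketch assumes a Reed-Solomon-style code over GF(2^8) in which each redundant block is a linear combination of the data blocks; the coefficients, block contents, and helper names (gf_mul, update_parity) are illustrative assumptions, not taken from the chapter. Because the code is linear, a change to one data block can be folded into the old parity by sending only the delta:

def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) using the common polynomial 0x11D."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        b >>= 1
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1D  # reduce modulo x^8 + x^4 + x^3 + x^2 + 1
    return p

def update_parity(old_parity: bytes, coeff: int,
                  old_block: bytes, new_block: bytes) -> bytes:
    """Recompute a redundant block from its old value and the data delta.

    In GF(2^8) addition and subtraction are both XOR, so
    new_parity = old_parity + coeff * (new_block - old_block)."""
    return bytes(p ^ gf_mul(coeff, o ^ n)
                 for p, o, n in zip(old_parity, old_block, new_block))

# Hypothetical 4-byte blocks: d0 changes, and parity = g0*d0 + g1*d1.
g0, g1 = 2, 3
d0_old, d0_new = b"\x10\x20\x30\x40", b"\x11\x22\x33\x44"
d1 = b"\xAA\xBB\xCC\xDD"
parity_old = bytes(gf_mul(g0, a) ^ gf_mul(g1, b) for a, b in zip(d0_old, d1))

parity_new = update_parity(parity_old, g0, d0_old, d0_new)
full = bytes(gf_mul(g0, a) ^ gf_mul(g1, b) for a, b in zip(d0_new, d1))
assert parity_new == full  # reusing old parity matches full recomputation

Only the delta d0_old XOR d0_new travels to the redundancy node, rather than every data block the parity depends on, which is the source of the transfer savings.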
15.2 Related Work
The problem of data reorganization has been studied previously in the context of disk arrays
[2, 3]. The work by Ghandeharizadeh and Kim [2] is the earliest study on data reorganization
known to the author. They investigated the data reorganization problem in the context of adding
disks to a continuous media server. They studied the round-robin data striping commonly found
in disk arrays and proposed techniques to perform data reorganization online, i.e., without
disrupting on-going video streams. Due to the round-robin placement requirement, a large
portion of the data blocks will need to be redistributed to maintain the data placement order when
a new disk is added, thus incurring significant data reorganization overhead. The advantage
is that this approach can maintain perfect streaming load balance when data reorganization is
completed.
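To see why preserving round-robin order is expensive, the short sketch below (with hypothetical block and disk counts) counts the blocks whose home disk changes when an array grows from n to n+1 disks:

def moved_fraction(num_blocks: int, n: int) -> float:
    """Fraction of blocks whose disk changes under round-robin restriping.

    Block j lives on disk j mod n before expansion and j mod (n+1) after."""
    moved = sum(1 for j in range(num_blocks) if j % n != j % (n + 1))
    return moved / num_blocks

for n in (4, 8, 16):
    print(f"{n} -> {n + 1} disks: {moved_fraction(10_000, n):.1%} of blocks move")
# Prints roughly n/(n+1), i.e., 80% or more of the blocks, versus the
# 1/(n+1) needed just to populate the new disk.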
In a more recent study, Goel et al. [3] proposed an algorithm called SCADDAR for data
placement and data reorganization in disk arrays. In this algorithm, each data block
is initially distributed to one of the disks at random with equal probability. When a new
disk is added to the array, each block obtains a new sequence number according to the
randomized SCADDAR algorithm. If the remainder of this number, taken modulo the new
number of disks, equals the disk number of the newly added disk, the block is moved to the
new disk; otherwise, it stays on its original disk. As SCADDAR no longer needs to maintain a strict
round-robin placement order, it can reduce the reorganization overhead to levels approaching
the theoretical lower bound.
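The remainder test can be sketched as follows. This is a simplified stand-in for SCADDAR, with hash-derived sequence numbers assumed in place of the actual pseudo-random sequence of [3]:

import hashlib

def seq_number(block_id: int, generation: int) -> int:
    """Deterministic pseudo-random sequence number per block per expansion."""
    digest = hashlib.sha256(f"{block_id}:{generation}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def add_disk(location: list, n: int, generation: int) -> int:
    """Add disk n to an n-disk array in place; return how many blocks moved."""
    moved = 0
    for b in range(len(location)):
        if seq_number(b, generation) % (n + 1) == n:
            location[b] = n   # remainder matches the new disk: move
            moved += 1        # all other blocks stay put
    return moved

num_blocks, n = 100_000, 4
location = [seq_number(b, 0) % n for b in range(num_blocks)]  # random start
moved = add_disk(location, n, generation=1)
print(f"moved {moved / num_blocks:.1%} of blocks "
      f"(lower bound 1/{n + 1} = {1 / (n + 1):.1%})")

Only about one block in n+1 moves, which is exactly the amount needed to give the new disk its fair share of the data, hence the near-minimal overhead.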
However, the SCADDAR algorithm did not consider streaming load balance. If we apply
SCADDAR to a streaming server, the pseudo-random placement policy can result
in significant streaming load imbalance, especially after a large number of disks have
been added to the system. This load imbalance complicates data transmission scheduling
and may also reduce the usable streaming capacity and/or increase the response time of the
system.
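A small simulation with illustrative parameters makes the imbalance concrete; under uniform random placement the busiest disk serves measurably more than its fair share of a video's blocks, whereas round-robin striping would give every disk exactly the same count:

import random

random.seed(7)
num_disks, blocks_per_video = 16, 2_000
load = [0] * num_disks
for _ in range(blocks_per_video):
    load[random.randrange(num_disks)] += 1  # uniform random placement

mean = blocks_per_video / num_disks
print(f"fair share per disk: {mean:.0f} blocks")
print(f"busiest disk:        {max(load)} blocks "
      f"({max(load) / mean:.2f}x the fair share)")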
Interestingly, the previous two studies can be considered as two extremes of the trade-off
between data reorganization overhead and streaming load balance. In particular, Ghandeharizadeh
and Kim's algorithm achieves perfect load balance at the expense of substantial data
reorganization overhead, while the SCADDAR algorithm achieves near-minimal data
reorganization overhead at the expense of load imbalance. The Row-Permutated Data
Reorganization described in Section 15.3, by comparison, can achieve perfect streaming load
balance and yet incurs significantly lower reorganization overhead than the round-robin
algorithm.