disks so that at least one copy is available after a disk failure. Examples include the rotational
mirrored declustering scheme proposed by Chen et al. [4], the doubly striped mirroring
scheme proposed by Mourad [6], and the random duplicated assignment proposed by
Korst [8].
Another approach makes use of parity encoding for data redundancy. A parity block together
with a number of data blocks forms a parity group. The entire parity group can be reconstructed
even if one of the blocks in the parity group is lost in a disk failure. Compared to replication,
parity encoding generally requires less redundancy overhead but a larger buffer for
data reconstruction. This approach has been investigated by Tobagi et al. [1] in their Streaming
RAID architecture, by Cohen et al. [3] in their pipelined disk array, by Berson et al. [2] in their
non-clustered scheme, and by Ozden et al. [5] in their declustered parity scheme and prefetch
scheme.
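The reconstruction property of a parity group can be illustrated with a minimal sketch. It assumes single XOR parity, which is the usual construction but is not specified in the text above; the function names are hypothetical and chosen only for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity_group(data_blocks):
    """Return the data blocks followed by one XOR parity block (assumed scheme)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(group, lost_index):
    """Rebuild the block at lost_index from all surviving blocks of the group."""
    survivors = [block for i, block in enumerate(group) if i != lost_index]
    return xor_blocks(survivors)


data = [b"abcd", b"efgh", b"ijkl"]
group = make_parity_group(data)
# Any single lost block, data or parity, can be recovered from the others.
recovered = [reconstruct(group, i) for i in range(len(group))]
```

Note that `reconstruct` needs every surviving block of the group at once, which is one way to see why parity encoding needs more buffer space for reconstruction than replication does.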
In another work by Cohen and Burkhard [7], a segmented information dispersal (SID)
scheme was proposed to allow a fine-grained trade-off between the two extremes of mirroring and
RAID-5 parity encoding. Reconstruction reads under SID are contiguous, leading to better disk
efficiency. The authors showed that the SID schemes match the performance of RAID-5 and
schemes based on balanced incomplete block designs under normal mode, and outperform
them under degraded mode of operation.
The previous studies all focus on the normal mode and degraded mode of operation. The
problem of rebuilding the data of a failed disk onto a spare disk in a media server has received little
attention. While there are many existing studies on disk rebuild, they have all focused on data
applications such as online transaction processing (OLTP) servers. Examples include the
work by Menon and Mattson [10-11], Hou et al. [12-13], Thomasian and Menon [14-16],
and Mogi and Kitsuregawa [17].
Disk rebuild in media server applications, however, differs from that in OLTP applications
in two major ways. First, OLTP applications generally do not have the stringent performance
requirements of a media server. In particular, performance of OLTP applications is commonly
measured using response time. While shorter response time is desirable, it is not a condition
for correct operation. Therefore, the focus in disk rebuild is on balancing service response time
against rebuild time. For example, one can use priority scheduling in OLTP applications to give
higher priority to normal requests to minimize their response time and to serve rebuild requests
with the unused disk time.
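The priority-scheduling idea for the OLTP case can be sketched as a two-level request queue. This is only an illustration under simple assumptions (strict priority, first-come-first-served within each class); the class and constant names are hypothetical and are not taken from any of the cited rebuild algorithms.

```python
import heapq
from itertools import count

NORMAL, REBUILD = 0, 1  # lower value is served first


class PriorityDiskQueue:
    """Serve normal requests first; rebuild requests use leftover disk time."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker keeps FIFO order within a class

    def submit(self, kind, request):
        heapq.heappush(self._heap, (kind, next(self._arrival), request))

    def next_request(self):
        """Return (kind, request) for the next request, or None if idle."""
        if not self._heap:
            return None
        kind, _, request = heapq.heappop(self._heap)
        return kind, request


queue = PriorityDiskQueue()
queue.submit(REBUILD, "rebuild stripe 0")
queue.submit(NORMAL, "read page 17")
queue.submit(REBUILD, "rebuild stripe 1")
queue.submit(NORMAL, "write page 42")
served = [queue.next_request() for _ in range(4)]
```

In this sketch a rebuild request is dispatched only when no normal request is waiting, so rebuild progress depends entirely on how much idle disk time the normal workload leaves.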
By contrast, a media server has to guarantee the retrieval of media data according to a
fixed schedule. Even a small delay beyond the schedule will result in service disruption.
Consequently, the rebuild process can take place only if normal media data retrievals can still
be completed on time. This requires detailed disk modeling and the use of worst-case analysis
to determine exactly how much disk time can be spent on the rebuild process. Unlike in rebuild
algorithms for OLTP applications, the amount of disk time to spend on rebuild is determined
a priori, given the disk parameters. Moreover, retrievals for playback data and rebuild data
are scheduled to minimize disk-seek time rather than by priority as in the OLTP
case.
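To make the a priori calculation concrete, the following sketch computes a rebuild budget per service round. It assumes a deliberately pessimistic disk model that is not taken from the text: every block retrieval is charged a full worst-case seek, a full rotation, and a transfer at the minimum rate. All function and parameter names are hypothetical, and times are kept in integer microseconds to avoid rounding surprises.

```python
def worst_case_block_time_us(max_seek_us, max_rotation_us,
                             block_bytes, min_rate_bytes_per_s):
    """Upper bound on the time to retrieve one block (assumed disk model)."""
    # Ceiling division so the transfer time is never underestimated.
    transfer_us = -(-block_bytes * 1_000_000 // min_rate_bytes_per_s)
    return max_seek_us + max_rotation_us + transfer_us


def rebuild_blocks_per_round(round_us, n_streams, block_time_us):
    """Blocks that can be rebuilt per round without delaying any playback stream."""
    slack_us = round_us - n_streams * block_time_us
    if slack_us < 0:
        raise ValueError("playback load alone exceeds the round length")
    return slack_us // block_time_us


def seek_order(playback_tracks, rebuild_tracks):
    """Merge both request types and serve them in track order, not by priority."""
    requests = [(track, "playback") for track in playback_tracks]
    requests += [(track, "rebuild") for track in rebuild_tracks]
    return sorted(requests)


block_time = worst_case_block_time_us(20_000, 10_000, 65_536, 6_553_600)
budget = rebuild_blocks_per_round(1_000_000, 20, block_time)
order = seek_order([50, 10], [30])
```

With these illustrative numbers, 20 streams consume 0.8 s of a 1 s round in the worst case, leaving room for 5 rebuild blocks. A real analysis would use a tighter seek bound that accounts for the track-ordered schedule.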
Second, OLTP applications commonly employ the RAID-5 striping scheme to maximize I/O
concurrency [9]. On the other hand, media server applications commonly employ the RAID-3
striping scheme for reasons to be discussed in Section 5.3.1. This fundamental difference in the
striping scheme, together with the inherently round-based disk scheduling algorithm employed
in media servers, requires different designs for the rebuild algorithm.