5 Reliable and Fault-Tolerant Storage Systems
In addition to streaming capacity, reliability is also an important issue in the deployment of
media streaming services. In particular, a high-capacity media server will likely be equipped
with a large disk array comprising many disks. The failure of any one of these disks
will cripple the entire media server. This is why a RAID is often employed to enable the
media server to survive rare but possible disk failures.
Nevertheless, even though the media server can continue operation after a disk failure, the
failed disk and the data it contains will eventually need to be replaced. Otherwise the media
server will be susceptible to data loss in case of additional disk failures. In this chapter we
address this issue and investigate algorithms that automatically rebuild the data stored on a
failed disk onto a standby spare disk. The rebuild process is automatic, i.e., it does not require
human intervention, and is transparent to the ongoing streaming service. We investigate
both block-based and track-based rebuild algorithms and present buffer sharing techniques
to reduce the buffer requirement. Our results show that automatic rebuild of a failed disk
can be completed in a reasonable amount of time even at relatively high server utilization
(e.g., less than 1.5 hours at 90% utilization), thus helping to improve the availability
of the media server.
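
To make the idea of rebuilding a failed disk onto a spare concrete, the following is a minimal sketch and not the rebuild algorithms studied in this chapter. It assumes a single-parity layout in which the last disk of each stripe holds the bytewise XOR of the data blocks (a RAID-4-like arrangement chosen only for simplicity), and it ignores scheduling against the ongoing streaming workload. The names xor_blocks, build_array, and rebuild_disk are hypothetical helpers introduced for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def build_array(stripes):
    """Lay out stripes on a toy array: one list of blocks per disk.

    Assumption: the last disk stores the XOR parity of each stripe.
    """
    n_disks = len(stripes[0]) + 1
    disks = [[] for _ in range(n_disks)]
    for data_blocks in stripes:
        for d, block in enumerate(data_blocks):
            disks[d].append(block)
        disks[-1].append(xor_blocks(data_blocks))
    return disks


def rebuild_disk(disks, failed, n_stripes):
    """Reconstruct every block of the failed disk, one stripe at a time.

    Each lost block is the XOR of the surviving blocks in its stripe.
    The returned list plays the role of the standby spare disk.
    """
    spare = []
    for s in range(n_stripes):
        survivors = [disk[s] for d, disk in enumerate(disks) if d != failed]
        spare.append(xor_blocks(survivors))
    return spare


# Usage: lose disk 1, then rebuild its contents from the survivors.
stripes = [[b"abcd", b"efgh", b"ijkl"], [b"mnop", b"qrst", b"uvwx"]]
disks = build_array(stripes)
lost_contents = disks[1]
disks[1] = None  # simulate the failure
spare = rebuild_disk(disks, failed=1, n_stripes=len(stripes))
```

In a real server the rebuild reads would have to share disk time with the streaming reads, which is the difficulty the block-based and track-based algorithms in this chapter address.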
5.1 Introduction
Since the introduction of media servers, a large number of researchers have investigated ways
to improve server capacity to cope with the bandwidth required to deliver high-quality
audio-visual content to a large number of users. Apart from the challenge of capacity, another
challenge, reliability, arises as soon as companies deploy paid services
to a large user population.
Specifically, a media server usually employs multiple disks in the form of a disk array for
media data storage and retrieval. Media data are then distributed evenly across all the disks
in small units so that data retrieval for a media stream will spread across all disks for load