symbols are added to provide protection from storage failures, where m = n − k. The
redundancy level is n/k (the code rate is its inverse, k/n).
Erasure coding approaches have a long history and are widely used to provide data
reliability assurance. For example, simple parity is used by RAID 5 to achieve
redundancy: if a drive in the array fails, the
remaining data on the other drives can be combined with the parity data (using the
Boolean XOR function) to reconstruct the missing data [64] . Reed-Solomon (RS)
codes are widely used in producing CDs, DVDs, and Blu-ray discs; building RAID
6 data arrays; storing data in mass storage systems [5]; and so forth. Some hybrid
research studies that combine replication and erasure coding, or that analyze the
differences between the two, have also been conducted [65,66]. One study [65]
proposed a solution, referred to as "fusion," that uses a combination of erasure
codes and selective replication to tolerate multiple crash faults over multiple
data structures in general distributed systems. Another study [66] analyzed
replication versus erasure coding as storage solutions for P2P systems, concluding
that erasure coding can significantly reduce the self-repair bandwidth. More
recently, erasure coding storage solutions have also appeared in Clouds [5,67].
One study [67] applied an erasure coding approach using Reed-Solomon 10 + 4 codes
to the HDFS-RAID storage systems at Facebook, and another [5] applied Local
Reconstruction Codes (LRC) with 6 + 3 or 12 + 4 codes to part of the Windows
Azure Storage service.
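The RAID 5 parity mechanism described above can be sketched in a few lines. The following is a minimal illustration, not production RAID code: a parity block is the XOR of all data blocks, and any single missing block is the XOR of the survivors with the parity. The function names (`make_parity`, `reconstruct`) are illustrative, not from any library.

```python
def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def make_parity(data_blocks):
    """RAID-5-style parity: the XOR of all data blocks."""
    return xor_blocks(data_blocks)

def reconstruct(surviving_blocks, parity):
    """Rebuild a single missing block from the survivors plus parity."""
    return xor_blocks(surviving_blocks + [parity])

# Three "drives" of data plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(data)

# Simulate losing data[1]; recover it from the other drives and the parity.
recovered = reconstruct([data[0], data[2]], parity)
assert recovered == data[1]
```

Because XOR is its own inverse, the same routine serves for both encoding and single-failure recovery; this is why RAID 5 tolerates exactly one drive failure.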
Unlike data replication approaches for storage, erasure-coding approaches divide
data into several different data blocks, modify the original data, and store the data
with additional erasure coding blocks. Erasure-coding approaches can assure data
reliability at a very high level. Compared to data replication, they perform
better at reducing both storage redundancy and data recovery bandwidth. However,
the computational overhead for encoding and decoding erasure-coded data is very
high. For example, in one study [68], the decoding time for a 16 MB data block
using Tornado Z codes is on the order of tens to hundreds of seconds, and even
this is above the average performance of other erasure codes, such as
Reed-Solomon codes.
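The storage-redundancy advantage over replication can be made concrete using the n/k redundancy level defined earlier. The sketch below simply evaluates n/k for 3-way replication and for the code configurations cited above; the exact per-scheme accounting (metadata, local parity groups in LRC) is simplified away.

```python
def redundancy(k, m):
    """Redundancy level n/k for a scheme with k data blocks and m coded blocks."""
    return (k + m) / k

# 3-way replication keeps k = 1 original plus m = 2 copies.
schemes = {
    "3-way replication":  redundancy(1, 2),
    "Reed-Solomon 10+4":  redundancy(10, 4),
    "LRC 6+3":            redundancy(6, 3),
    "LRC 12+4":           redundancy(12, 4),
}
for name, r in schemes.items():
    print(f"{name}: {r:.2f}x raw storage per byte of user data")
```

Under this simplified accounting, replication costs 3.0x raw storage, while the cited erasure codes cost roughly 1.3x to 1.5x, which is the redundancy saving the text refers to.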
2.3 Data transfer for distributed systems
Data recovery is a very important aspect of data reliability management. No matter
which data redundancy approach is applied, the lost data must always be recovered
when possible so that the redundancy can be maintained at a satisfactory level. Data
recovery approaches are highly dependent on the data storage schema of the distrib-
uted storage systems. For systems with either replication-based or erasure coding-
based data storage schema, different replication levels or erasure codes could lead to
different data recovery strategies [5]. However, for recovering data in a large
distributed storage system, there is one universal principle: when data is lost,
the lost data (whether already restored to its original form or not) must be
transferred from somewhere (to somewhere else) to recover the original status of
the data, and