Database Reference
In-Depth Information
2 Background
2.1 Presentation of IDA
Information Dispersal Algorithm (IDA), first proposed by O. Rabin in 1989
[Rab89], is used to break a file F of length L = |F| into n pieces F i , i in [1 ..n ],
each of length |F i | = L/m ,sothatevery m pieces suce to rebuild F .The
sum of the lengths F i is ( n/m ) ∗ L . With IDA, since any m pieces, among the n
created, can reconstruct the data, loosing parts of data when stocking or routing
can be remedied.
Mainly, IDA is based on two key parameters. As defined, n is the total number
of pieces resulting from IDA, of which any m chunks are sucient to reconstruct
the original file. m is called ”quorum” or ”threshold”. O. Rabin defines the ratio
between m and n as follows: n/m =1+ e ( e> 0), so it ensures greater reliability
of the algorithm in terms of security and storage capacity.
IDA is based on two routines. The split operation generates the n chunks to
be distributed. The combine routine requires m chunks passed as arguments to
reconstruct the original data.
Both algorithms of division (split) and recomposition (combines) operate by
performing a multiplication of the entry and a key matrix, also called transfor-
mation matrix.
The key matrix must have the property that every subset of m different vectors
are linearly independent. If this is not the case, the key matrix can not be inverted
and the combine phase is not feasible. Indeed, the key matrix is generated from
a ”key”. It is defined as a vector V key composed of a list of elements respecting
the following constraints:
V key :( x 1 , ..., x n ,y 1 , ..., y m ),forevery i , j :
(1) x i + y j =0;(2) i = j → x i = x j et y i = y j
A row of the key matrix is:
1
x i + y 1
1
x i + y m
a i =(
, ...,
)
First, The key matrix A ( n∗m ) is generated respecting the mathematical con-
straints seen above. The next step is to decompose the input data into sequences.
Each sequence Si is expressed as a vector F i . F i is composed of m elements. An
element is of size w and can be a character ( w = 1), a word, etc.. We will call the
matrix composed of the different vectors F i ,thematrix F . Then, F is transposed
to have F t . The multiplication of A by F t generates a matrix R (IDA output).
A chunk M i is a file composed of 1) a header containing the key vector of order
a i (i.e. the i th row of the key matrix), 2) a body storing R i . M i is a chunk to
send to a mapper.
Once m chunks are collected by a host, the key matrix is obtained from the
m headers. The product result of the inverted key matrix by the content of the
chunks gives the original data.
 
Search WWH ::




Custom Search