Information Technology Reference
In-Depth Information
( UID m
( vv, rs )
DataBase
= VersionVector
×
IdRecord )
m
vv
VersionVector = MachineId
Numeral -set
r
IdRecord
=
{
fid : FileId, gvsn : GVSN, parent : UID,
clock : Numeral , name : Name , live : bool
}
gvsn
GVSN
= MachineId
×
Numeral Global version sequence number
uid
UID
= Globally unique identifier
Version Vectors. File replication systems typically use global version sequence
numbers, which are pairs (Unique Machine Identifier, Version Sequence Num-
ber), to identify a resource and its version globally. The version sequence num-
ber is a local time-stamp, which can be assumed monotonically increasing with
changes. A version vector is a map from machine identifiers to version sequence
numbers. They typically map a machine identifier to a single number, but in the
case of DFS-R we found that allowing the vectors to map to a set of numbers
(represented compactly as intervals of numerals) allowed handling, for instance,
synchronization disruptions. Version vectors are also known as vector clocks.
Version vectors are used to record a state of knowledge , as the vector indicates
a water-mark of versions that have been received from other machines.
We may think of a version vector as a set of GVSN pairs obtained by taking
{
. Similarly, one can form a version vector from
asetof GVSN pairs. In the future we will switch between the set and map view
of version vectors depending on the context. Thus, vv [ m ] is defined for each m .
It is the empty set if m
( m, v )
|
[ m
vs ]
vv
v
vs
}
Dom( vv )asamap.
Database Records. A file (we will use file to also refer to a directory) is
identified globally through a unique identifier uid , while the per file system file
identifier is the fid . The set of database records may be indexed by a uid and
a fid . Each record stores a global version sequence number gvsn that tracks the
version of the file, a file name name , a reference to a parent directory parent ,
and an indication whether the resource represents an existing file on the file
system. If live is false, we call the resulting record a tombstone .The clock field is
a Lamport clock [6], it gets incremented with every file update. Lamport clocks
are used to enforce causal ordering per record by assuming a total lexicographic
ordering on GVSN and define:
r<r
iff r.clock < r .clock
( r.clock = r .clock
r.gvsn < r .gvsn )(1)
We will later establish that property (5), which only uses version vectors, suces
for detecting conflicts (absence of causality) among all replicated files. Neverthe-
less, this property is significant, as the number of replicated files in the context
of DFS-R is much larger than the number of replicating machines.
The records in DFS-R contain a number of additional fields, such as file hashes,
file creation time and file attributes.
Local and Global Consistency. We are now in a position where we can state
the main soundness properties that DFS-R aims to achieve:
 
Search WWH ::




Custom Search