Information Technology Reference
In-Depth Information
(
UID
m
(
vv, rs
)
∈
DataBase
=
VersionVector
×
→
IdRecord
)
m
→
vv
∈
VersionVector
=
MachineId
Numeral
-set
r
∈
IdRecord
=
{
fid
:
FileId, gvsn
:
GVSN, parent
:
UID,
clock
:
Numeral
, name
:
Name
, live
:
bool
}
gvsn
∈
GVSN
=
MachineId
×
Numeral
Global version sequence number
uid
∈
UID
= Globally unique identifier
Version Vectors.
File replication systems typically use global version sequence
numbers, which are pairs (Unique Machine Identifier, Version Sequence Num-
ber), to identify a resource and its version globally. The version sequence num-
ber is a local time-stamp, which can be assumed monotonically increasing with
changes. A version vector is a map from machine identifiers to version sequence
numbers. They typically map a machine identifier to a single number, but in the
case of DFS-R we found that allowing the vectors to map to a set of numbers
(represented compactly as intervals of numerals) allowed handling, for instance,
synchronization disruptions. Version vectors are also known as vector clocks.
Version vectors are used to record a state of
knowledge
, as the vector indicates
a water-mark of versions that have been received from other machines.
We may think of a version vector as a set of
GVSN
pairs obtained by taking
{
. Similarly, one can form a version vector from
asetof
GVSN
pairs. In the future we will switch between the set and map view
of version vectors depending on the context. Thus,
vv
[
m
] is defined for each
m
.
It is the empty set if
m
(
m, v
)
|
[
m
→
vs
]
∈
vv
∧
v
∈
vs
}
∈
Dom(
vv
)asamap.
Database Records.
A file (we will use file to also refer to a directory) is
identified globally through a unique identifier
uid
, while the per file system file
identifier is the
fid
. The set of database records may be indexed by a
uid
and
a
fid
. Each record stores a global version sequence number
gvsn
that tracks the
version of the file, a file name
name
, a reference to a parent directory
parent
,
and an indication whether the resource represents an existing file on the file
system. If
live
is false, we call the resulting record a
tombstone
.The
clock
field is
a Lamport clock [6], it gets incremented with every file update. Lamport clocks
are used to enforce causal ordering per record by assuming a total lexicographic
ordering on
GVSN
and define:
r<r
iff
r.clock < r
.clock
(
r.clock
=
r
.clock
r.gvsn < r
.gvsn
)(1)
∨
∧
We will later establish that property (5), which only uses version vectors, suces
for detecting conflicts (absence of causality) among
all
replicated files. Neverthe-
less, this property is significant, as the number of replicated files in the context
of DFS-R is much larger than the number of replicating machines.
The records in DFS-R contain a number of additional fields, such as file hashes,
file creation time and file attributes.
Local and Global Consistency.
We are now in a position where we can state
the main soundness properties that DFS-R aims to achieve:
Search WWH ::
Custom Search