Information Technology Reference
In-Depth Information
- Global consistency: Databases of machines connected in a network are equal
except for the contents of the fid fields.
- Local consistency: On each machine, the database records the content on
the file system.
A very substantial part of DFS-R consists in maintaining local consistency.
DFS-R uses the NTFS change journal, which produces a record for every file
operation and is accessible through a special file. The change journal provides
an incremental way to obtain file changes. Since DFS-R only tracks files that
are replicated, it furthermore needs to scan directories that are moved into or
out of the replicated folders. The local consistency algorithms must also take
into account that change journals wrap, that is, not all consecutive changes
are available to DFS-R, and that change journals may be deleted, resized and/or
re-created by administrators. Here we concentrate only on global consistency,
as it illustrates the distributed protocol problems discussed later in this
paper.
So for the rest of the discussion, we will use simplified definitions of machines
and database records. While this approach makes things look much simpler than
reality, it allows us to concentrate on the specific topics in this paper.
m ∈ Machine = VersionVector × (UID →ᵐ IdRecord) × inbound
r ∈ IdRecord = { gvsn : GVSN, parent : UID, clock : Numeral,
                 name : Name, live : bool }
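As a reading aid, the simplified definitions can be transcribed into a small
Python sketch. The concrete representations are assumptions made only for
illustration: a GVSN is taken to be a pair of machine identifier and version
number, a version vector a map from machine identifiers to sets of version
numbers, and the names vv and db are hypothetical. The inbound component is
left out, since its shape is not given here.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

UID = str                                   # assumed: opaque unique identifier
MachineId = str                             # assumed: opaque machine name
GVSN = Tuple[MachineId, int]                # assumed: (origin machine, version)
VersionVector = Dict[MachineId, Set[int]]   # assumed: versions known per machine


@dataclass
class IdRecord:
    """One database record, with the fields listed in the definition."""
    gvsn: GVSN
    parent: UID
    clock: int
    name: str
    live: bool


@dataclass
class Machine:
    """A machine: a version vector plus a finite map from UID to IdRecord."""
    vv: VersionVector = field(default_factory=dict)      # hypothetical name
    db: Dict[UID, IdRecord] = field(default_factory=dict)  # hypothetical name


# A machine holding only the replicated root folder.
m1 = Machine(vv={"m1": {1}})
m1.db["root"] = IdRecord(gvsn=("m1", 1), parent="root", clock=0,
                         name="root", live=True)
```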
3.2 Operations
The main operations relevant to file replication consist of local file system activity
and synchronization.
The file system operations called Create, Update, Rename and file Delete in
Fig. 2 cause the local version vector to be updated with a fresh version for the
machine that performs the change. The database records are also updated to
reflect the new file system state.
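The effect of the local operations can be sketched as follows. This is not the
guarded-command formulation of Fig. 2 but a minimal Python approximation under
stated assumptions: a fresh version is taken to be one more than the largest
version the machine has issued so far, Delete is modelled as clearing the live
flag, and the clock value is supplied by the caller. All class and method names
are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Record:
    gvsn: Tuple[str, int]
    parent: str
    clock: int
    name: str
    live: bool = True


@dataclass
class Machine:
    mid: str
    vv: Dict[str, Set[int]] = field(default_factory=dict)
    db: Dict[str, Record] = field(default_factory=dict)

    def _fresh(self) -> Tuple[str, int]:
        # Assumption: versions are consecutive integers per machine.
        known = self.vv.setdefault(self.mid, set())
        version = max(known, default=0) + 1
        known.add(version)
        return (self.mid, version)

    def create(self, uid: str, parent: str, name: str, clock: int) -> None:
        self.db[uid] = Record(self._fresh(), parent, clock, name)

    def update(self, uid: str, clock: int) -> None:
        rec = self.db[uid]
        rec.gvsn, rec.clock = self._fresh(), clock

    def rename(self, uid: str, parent: str, name: str, clock: int) -> None:
        rec = self.db[uid]
        rec.gvsn, rec.parent, rec.name, rec.clock = (
            self._fresh(), parent, name, clock)

    def delete(self, uid: str, clock: int) -> None:
        # Assumption: deletion keeps the record and marks it as not live.
        rec = self.db[uid]
        rec.gvsn, rec.live, rec.clock = self._fresh(), False, clock


m = Machine("m1")
m.create("f", parent="root", name="a", clock=1)
m.rename("f", parent="root", name="b", clock=2)
m.delete("f", clock=3)
```

Every operation stamps the touched record with a version that is new to the
local version vector, which is the property the text relies on.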
We assume an initial state consisting of an arbitrary network of machines all
sharing a single replicated root folder and no other files. We use tuples with
mutable fields in the guarded commands, and we omit checks for whether elements
are in the domain of a map prior to accesses.
A direct way to synchronize two databases is by merging version vectors
and traversing all records on a sending machine m2; those records whose keys
do not exist on the receiving machine m1 are inserted. Records that dominate
existing records on m1 are also inserted. Fig. 3 illustrates the proposed scheme.
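The direct synchronization just described might look roughly like the
following Python sketch. It is an approximation, not a transcription of Fig. 3:
the dominance order is assumed to compare clocks first and to break ties on the
GVSN, version vectors are assumed to map machine identifiers to sets of version
numbers, and no filtering on the receiver's version vector is performed. The
names Record, dominates and sync are hypothetical.

```python
from typing import Dict, NamedTuple, Set, Tuple


class Record(NamedTuple):
    gvsn: Tuple[str, int]
    parent: str
    clock: int
    name: str
    live: bool


def dominates(new: Record, old: Record) -> bool:
    # Assumption: later clock wins; the GVSN is only a tie-breaker.
    return (new.clock, new.gvsn) > (old.clock, old.gvsn)


def sync(vv1: Dict[str, Set[int]], db1: Dict[str, Record],
         vv2: Dict[str, Set[int]], db2: Dict[str, Record]) -> None:
    """Pull the sender's state (vv2, db2) into the receiver (vv1, db1)."""
    for uid, rec in db2.items():
        # Insert unknown keys, and records that dominate the existing one.
        if uid not in db1 or dominates(rec, db1[uid]):
            db1[uid] = rec
    # Merge version vectors: the receiver now knows every sender version.
    for machine, versions in vv2.items():
        vv1.setdefault(machine, set()).update(versions)


# Receiver m1 knows file f under name "a"; sender m2 renamed it and added g.
vv1 = {"m1": {1}}
db1 = {"f": Record(("m1", 1), "root", 1, "a", True)}
vv2 = {"m1": {1}, "m2": {1, 2}}
db2 = {"f": Record(("m2", 1), "root", 2, "b", True),
       "g": Record(("m2", 2), "root", 1, "c", True)}
sync(vv1, db1, vv2, db2)
```

Each record is decided on its own, without looking at any other record in the
database.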
The scheme implements a last-writer-wins strategy, as later updates prevail
over earlier updates. We will later see that the check of v against vv1[m] is
in fact redundant. Another property of this scheme is that each update is
processed independently. Notice that this is an implementation choice, which
comes with limitations. Conflict resolution that can only make decisions based
on a single record cannot detect that a machine swapped the names of two files.
Namely, suppose machines m1 and m2 share two files named a and b. Then m2
renames a
 