Directory services and DNSs are great examples of how different data architecture
patterns are used in conjunction with RDBMSs to provide specialized data services.
Because their data is relatively simple, they don't need complex query languages to be
effective. These highly distributed systems sit at different points in the CAP triangle to
meet different business objectives. NoSQL systems frequently incorporate techniques
used in these distributed systems to achieve different availability and performance
objectives.
In our last section, we'll look at how document revision control systems provide a
unique set of services that NoSQL systems also share.
3.7 Using hash trees in revision control systems and database synchronization
As we come to our last section, we'll look at some innovations in revision control
systems for software engineering to see how these innovations are being used in NoSQL
systems. We'll touch on how revision control systems like Subversion and Git make
the job of distributed software development much easier. Finally, we'll see how
revision control systems use hashes and deltas (the stored differences between
revisions) to synchronize complex documents.
Is it “version” or “revision” control?
The terms version control and revision control are both commonly used to describe
how you manage the history of a document. Although there are many definitions,
version control is a general term applied to any method that tracks the history of a
document. This would include tools that store multiple binaries of your Microsoft Word
documents in a document management system like SharePoint. Revision control is
a more specific term that describes a set of features found in tools like Subversion
and Git. Revision control systems include features such as adding release labels
(tags), branching, merging, and storing the differences between text documents.
We'll use the term revision control, as it's more specific to our context.
Revision control systems are critical for projects that involve distributed teams of
developers. For these types of projects, losing code or using the wrong code means
lost time and money. These systems use many of the same patterns you see in NoSQL
systems, such as distributed systems, document hashing, and tree hashing, to quickly
determine whether things are in sync.
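To make the idea of tree hashing concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular revision control system or NoSQL database implements it: a directory is modeled as a nested dict, SHA-256 is an arbitrary choice of hash, and the names hash_blob, hash_tree, and changed_paths are hypothetical. The point it shows is that two trees with equal root hashes can be treated as in sync without comparing every document, and that a mismatch can be narrowed down by descending only into subtrees whose hashes differ.

```python
import hashlib


def hash_blob(content: bytes) -> str:
    """Hash the bytes of a single document (SHA-256 is an assumed choice)."""
    return hashlib.sha256(content).hexdigest()


def hash_tree(tree: dict) -> str:
    """Hash a directory modeled as {name: bytes or nested dict}.

    Each entry contributes its kind, name, and child hash, so a change
    anywhere below a directory changes that directory's hash too.
    """
    h = hashlib.sha256()
    for name in sorted(tree):  # sorted so the hash is order-independent
        node = tree[name]
        if isinstance(node, dict):
            kind, child = "tree", hash_tree(node)
        else:
            kind, child = "blob", hash_blob(node)
        h.update(f"{kind} {name} {child}\n".encode("utf-8"))
    return h.hexdigest()


def changed_paths(a: dict, b: dict, prefix: str = "") -> list:
    """List paths that differ, skipping subtrees whose hashes match."""
    diffs = []
    for name in sorted(set(a) | set(b)):
        path = prefix + name
        left, right = a.get(name), b.get(name)
        if left is None or right is None:
            diffs.append(path)  # present on one side only
        elif isinstance(left, dict) and isinstance(right, dict):
            if hash_tree(left) != hash_tree(right):
                diffs.extend(changed_paths(left, right, path + "/"))
        elif isinstance(left, dict) or isinstance(right, dict):
            diffs.append(path)  # a file on one side, a directory on the other
        elif hash_blob(left) != hash_blob(right):
            diffs.append(path)
    return diffs


local = {"README": b"hello", "src": {"a.py": b"x = 1", "b.py": b"y = 2"}}
remote = {"README": b"hello", "src": {"a.py": b"x = 1", "b.py": b"y = 3"}}

print(hash_tree(local) == hash_tree(remote))
print(changed_paths(local, remote))
```

This sketch recomputes hashes on every comparison to stay short. A practical system would store each node's hash alongside the node so that a comparison only reads hashes it has already computed.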
Early revision control systems (RCSs) weren't distributed. There was a single hard
drive that stored all source code, and all developers used a networked filesystem to
mount that drive on their local computer. There was a single master copy that
everyone used, and no tools were in place to quickly find differences between two
revisions, making it easy to inadvertently overwrite code. As organizations began to
realize that talented development staff didn't necessarily live in their city,
developers were
 