> GET author:1
"Michael Manoochehri"
Some examples of open-source key-value stores are Apache Cassandra and
LinkedIn's Project Voldemort. Later in this chapter, we will take a closer look at
building a scalable data solution using the open-source Redis database, one of the
most popular in-memory key-value stores.
Document Store
Every day, we interact with numerous documents of various types, both physical and
virtual, such as business cards, receipts, tax returns, and playlists. Some of these
documents have similar characteristics, such as the time they were created or the
information they might contain about a particular person. Other documents contain
data completely unique to the document type; an online application may have any
number of different fields, for example. The data from this variety of documents might
be difficult to express using the rigid schemas found in relational databases. Not only
that: What if the schema of a variety of documents needed to be changed? In these
cases, it might be the right time to look into using a document store.
A document store is a type of database that stores data as a collection of (you
guessed it) documents. These documents themselves may be XML representations,
JSON objects, or even specific binary formats (see Chapter 2 for a closer look at
these formats). In contrast to a relational database, in which every record in a table
must adhere to the same schema, a document store can contain a variety of records
with completely different schemas. In other words, each record might have a
completely different structure. Although this is also true of most key-value stores, the
difference is that document stores usually allow the user to query the actual data
inside the documents, rather than retrieving records only by key.
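To make the contrast with key-only lookup concrete, here is a minimal sketch in Python. It does not use a real document store: a plain list of dictionaries stands in for the database, and the `find` helper, the field names, and the sample values are all hypothetical, invented for illustration.

```python
import json

# A tiny in-memory stand-in for a document store: a list of JSON-like dicts.
# Every field name and value here is made up for illustration.
documents = [
    {"type": "business_card", "name": "Ada Example", "phone": "555-0100"},
    {"type": "receipt", "name": "Ada Example", "total": 12.50,
     "items": ["coffee", "bagel"]},
    {"type": "playlist", "title": "Road Trip", "tracks": ["Track A", "Track B"]},
]


def find(docs, **criteria):
    """Return every document whose fields equal all of the given criteria."""
    return [
        doc for doc in docs
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Query by the content of the documents, not by a key.
by_person = find(documents, name="Ada Example")
print(json.dumps(by_person, indent=2))
```

Each of the three records has a different set of fields, yet the query on `name` still works: documents that lack the field simply do not match. A real document store would typically add indexes and a richer query language on top of this idea.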
A canonical example that illustrates the differences between a document store and
a relational database can be found in serving the information necessary to construct
a page for a typical blog. Blog pages feature not only page content and a title but also
additional content such as an author name, links to related posts, and even user
comments. If this information were stored in a relational database, the queries
necessary to build a single page would require accessing a large number of tables.
The user of a document store takes a different approach: all of the content for a
single page is stored in a single, large record. These records remain independent of
one another, and changing one does not affect the rest of the blog post records. If one
of the blog pages contains a completely different chunk of information (say, photo
URLs for a slideshow), this information can be added to any document without
worrying about the schema of the others. The relational database, on the other hand,
would represent all of the information as relationships between existing, normalized
tables. If a slideshow feature were needed, a new “slideshow” table, with a strictly
defined schema, would need to be created. In addition, relationships to the rest of the
content of the page would need to be defined, likely by a key relating it to a unique
blog post ID.
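The slideshow scenario can be sketched side by side. The example below is only an illustration under stated assumptions: a Python dictionary stands in for the document store, SQLite (from the standard library) stands in for the relational database, and the keys, table names, column names, and URLs are all hypothetical.

```python
import sqlite3

# Document approach: each blog page is one self-contained record.
posts = {
    "post:1": {
        "title": "Hello",
        "body": "First post.",
        "author": "Example Author",
        "comments": [{"user": "reader1", "text": "Nice post"}],
    },
    "post:2": {
        "title": "Trip photos",
        "body": "Some pictures.",
        "author": "Example Author",
        "comments": [],
        # Only this document carries slideshow data; post:1 is untouched.
        "slideshow": ["http://example.com/a.jpg", "http://example.com/b.jpg"],
    },
}

# Relational approach: the same feature needs a new table with its own
# schema, tied back to the posts table by a post ID.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT);
    CREATE TABLE slideshow (
        id INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL REFERENCES posts(id),
        photo_url TEXT NOT NULL
    );
    INSERT INTO posts (id, title, body) VALUES (1, 'Hello', 'First post.');
    INSERT INTO posts (id, title, body) VALUES (2, 'Trip photos', 'Some pictures.');
    INSERT INTO slideshow (post_id, photo_url) VALUES (2, 'http://example.com/a.jpg');
    INSERT INTO slideshow (post_id, photo_url) VALUES (2, 'http://example.com/b.jpg');
""")

# Building the page now means joining the new table to the existing one.
rows = conn.execute(
    "SELECT p.title, s.photo_url FROM posts p "
    "JOIN slideshow s ON s.post_id = p.id WHERE p.id = ? ORDER BY s.id",
    (2,),
).fetchall()
```

In the document version, adding the slideshow touched one record and nothing else. In the relational version, it required a schema change plus a join, and a full blog page would need similar joins for authors, comments, and related posts.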
 