Database Reference
Summary
The most common architecture for storing data, the relational database model, was a
result of the work of database pioneers such as Edgar Codd. The relational database
model is designed to provide consistency, a flexible query model, and predictability.
Many Web and mobile applications must handle both a constant barrage of incoming
data and the need to scale up predictably as the number of users and clients grows. As
data sizes get larger and the need for databases to be tolerant of faults increases, the
effort needed to scale and replicate relational database systems tends to make them
impractical for high-throughput applications with huge data volumes. A common solution
to the problem of massive amounts of data is to use alternative architectures
that eschew the traditional design choices of relational databases. These
are often referred to broadly as NoSQL technologies. Two of the most popular types of
nonrelational database are key-value stores and document stores. Key-value data stores
allow each record in a database to be accessed by a single key. The data does not need
to match a pre-existing schema. This architecture allows for very fast performance, but
key-value stores lack the ability to query data by value. In contrast, document stores
provide the ability to query against the document itself. Document stores are excellent
choices when the data retrieved is best used in single-document form (such as Web site
content) or when your database schema is very fluid.
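As a rough illustration of the difference in access patterns described above, the following Python sketch models both kinds of store with plain in-memory dictionaries. The names (kv_store, doc_store, kv_get, find_documents) and the sample records are hypothetical and are not taken from any particular database product.

```python
# Toy in-memory models; real stores add persistence, indexing, and networking.

# Key-value store: each record is an opaque value reachable only by its key.
kv_store = {
    "user:1": b'{"name": "Ada", "city": "London"}',
    "user:2": b'{"name": "Grace", "city": "New York"}',
}


def kv_get(key):
    """Look up a record by key; the store never inspects the value."""
    return kv_store.get(key)


# Document store: records are structured documents whose fields can be queried.
# The two documents do not share a schema (only one has "languages").
doc_store = {
    "user:1": {"name": "Ada", "city": "London"},
    "user:2": {"name": "Grace", "city": "New York", "languages": ["COBOL"]},
}


def find_documents(**criteria):
    """Return the ids of documents whose fields match all given criteria."""
    return sorted(
        doc_id
        for doc_id, doc in doc_store.items()
        if all(doc.get(field) == value for field, value in criteria.items())
    )
```

In this sketch, kv_get("user:1") is the only way to reach a record in the key-value model, whereas find_documents(city="London") searches by the content of the documents themselves.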
Even in the crowded world of open-source nonrelational data stores, various solutions
are designed to excel at one particular use case or another. Some database technologies
are designed to perform well under heavy load, at the expense of consistency
across nodes. Other types of databases specialize in being as easy as possible to scale
across a cluster of machines, or as flexible as possible with schema changes. For small- or
medium-sized applications that require a strong guarantee of consistency and a flexible querying
model, relational databases are still the best choice.
For applications that require high throughput for database writes, a great choice
is to use a key-value data store. Much like a hash table, key-value architecture stores
data as a collection of unique key-value pairs, resulting in very quick data storage and
retrieval. This speed comes at the expense of being able to query data by value. Unlike
a document store, a key-value store allows data to be accessed only by its key. The most
popular open-source technology that uses this approach is Redis, which combines
an in-memory key-value system with automatic snapshots to disk. Configuring these
snapshots to be written to a persistent disk provides some degree of fault tolerance.
Although holding a dataset entirely in memory is both a source of speed
and a potential liability, Redis can be used in a distributed manner using client-side
sharding. Twemproxy, which provides a hashing proxy layer that automatically
distributes keys to a pool of Redis instances, is currently the best way to shard a database
across a pool of separate Redis instances.
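The idea of distributing keys across a pool of instances by hashing can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Twemproxy's actual algorithm: the shard addresses are placeholders, shard_for is a hypothetical helper, and the simple hash-modulo scheme shown here is only one possible distribution strategy.

```python
import hashlib

# Hypothetical pool of Redis instance addresses (placeholders, not real hosts).
SHARDS = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]


def shard_for(key, shards=SHARDS):
    """Pick a shard by hashing the key and reducing it modulo the pool size."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

Because the hash is deterministic, every client that shares the same shard list sends a given key to the same instance. A drawback of plain modulo hashing is that most keys map to a different shard when the pool size changes, which is one reason a dedicated proxy layer may be preferable to hand-rolled sharding.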
The distributed database space is still evolving rapidly. A number of new software
solutions are combining the structure and consistency of Edgar Codd's relational model
with the potential for scalability found in key-value and document databases.
 
 