Building a NoSQL-Based Web App to Collect Crowd-Sourced Data - Data Just Right: Introduction to Large-Scale Data and Analytics

databases refer to general design concepts rather than strict categories, as features of

one type are often found in the other.

Key-Value Database

It goes without saying that Amazon.com attracts a large number of users, and, out

of necessity, the company has been on the cutting edge of application scalability. A

seminal paper about the nonrelational database technology behind Amazon's Dynamo

database, entitled “Dynamo: Amazon's Highly Available Key-value Store,” describes a set of use

cases for which relational databases are not ideal. “For many services,” the paper states,

“such as those that provide best seller lists, shopping carts, customer preferences, session

management, sales rank, and product catalog, the common pattern of using a relational

database would lead to inefficiencies and limit scale and availability.” [2] When data

volumes grow, reading from and writing to a relational database while maintaining ACID

compliance can be computationally expensive.

A key-value store looks a lot like a big hash table. Each record in the database is an

object identified by a unique key. The value that is addressed by this key can basically

be anything: a string, a JSON or XML document, a binary blob, or a number of other

things. A key characteristic of these databases is that the system doesn't really know

anything about the data being stored; all access to data comes from the key. Key-value

stores are the right choice for applications that generally retrieve database information

based on a single key.
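
As a rough illustration of the hash-table analogy, the following Python sketch models a key-value store as an in-memory dictionary whose values are opaque byte strings. The class name TinyKVStore, its methods, and the "cart:42" record are hypothetical, invented for this example; they are not the API or data of Dynamo or of any real product.

```python
import json


class TinyKVStore:
    """Hypothetical in-memory key-value store: a thin wrapper around a dict."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # The store treats the value as opaque bytes and never inspects it.
        if not isinstance(value, bytes):
            raise TypeError("values must be bytes")
        self._data[key] = value

    def get(self, key):
        # Every read goes through the key; returns None if the key is absent.
        return self._data.get(key)


store = TinyKVStore()

# A plain string and a JSON document are stored the same way: as bytes.
store.set("book:1", "Data Just Right".encode("utf-8"))
store.set("cart:42", json.dumps({"items": ["sku-1", "sku-2"]}).encode("utf-8"))

# Only the application knows how to interpret what comes back.
title = store.get("book:1").decode("utf-8")
cart = json.loads(store.get("cart:42"))
```

The point of the sketch is that decoding happens entirely on the application side; the store itself could not answer a question such as "which carts contain sku-1" without reading every value.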

This design makes for very good performance for database writes. Another advantage

is that the design lends itself to scaling and replication across a network

of machines. Because each record in the database is simply a value addressed by

a unique key, the data itself can reside on any machine on the network as long as the

system knows how it can be located. One way to scale this type of database is to keep

a secondary table of key-range to machine mappings. When a piece of data is requested,

the system can consult this secondary table and send the request to the proper

machine.
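
The key-range mapping described above might look something like the following Python sketch. The range boundaries, the machine names, and the function machine_for are all assumptions made for illustration; a production system would also have to handle replication, rebalancing, and machine failure, none of which is shown here.

```python
import bisect

# Hypothetical routing table: each entry is the first key of a range, and the
# machine at the same position owns keys from there up to the next entry.
RANGE_STARTS = ["a", "h", "p"]
MACHINES = ["node-1", "node-2", "node-3"]


def machine_for(key):
    """Return the machine whose key range contains `key`."""
    # bisect_right counts how many range starts are <= key; the last of
    # those starts marks the range that the key falls into.
    index = bisect.bisect_right(RANGE_STARTS, key) - 1
    if index < 0:
        raise KeyError(f"no machine owns key {key!r}")
    return MACHINES[index]


owner_book = machine_for("book:1")    # falls in the range starting at "a"
owner_order = machine_for("order:9")  # falls in the range starting at "h"
owner_user = machine_for("user:7")    # falls in the range starting at "p"
```

Because the range starts are kept sorted, each lookup is a binary search over the table rather than a scan, so the routing step stays cheap even as the number of ranges grows.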

On the other hand, a general disadvantage of the design of a key-value store is that

data cannot be accessed by value. In other words, it is impossible to query a key-value

data store for all records that contain a particular set of values. The only way to query a

key-value database is by specifying a request by key or, in some cases, a range of keys.

Listing 3.2 demonstrates some examples of using a key-value store to query for data.
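
The trade-off can also be sketched in Python, using the same two records that appear in Listing 3.2. This is only an illustration with a plain dictionary standing in for the store, and the helper names keys_in_range and find_by_value are hypothetical. It shows that a lookup by key is a single access, while a lookup by value has no index to lean on and must examine every record.

```python
# The two records from Listing 3.2, held in a plain dict for illustration.
store = {
    "book:1": "Data Just Right",
    "author:1": "Michael Manoochehri",
}

# Lookup by key: a single hash-table access.
title = store["book:1"]


def keys_in_range(data, start, end):
    """Return the keys k with start <= k < end, in sorted order."""
    return sorted(k for k in data if start <= k < end)


def find_by_value(data, wanted):
    """Fallback when no key is known: examine every record in the store."""
    return [k for k, v in data.items() if v == wanted]


# All keys with the "book:" prefix (";" is the character after ":" in ASCII).
book_keys = keys_in_range(store, "book:", "book;")

# Finding a record by its value touches every entry.
matches = find_by_value(store, "Michael Manoochehri")
```

With two records the full scan is harmless, but its cost grows with the size of the data set, which is why applications that need to search by value usually look to a different kind of database.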

Listing 3.2 Organizing data by key using a key-value data store

> SET book:1 "Data Just Right"
> SET author:1 "Michael Manoochehri"
> GET book:1
"Data Just Right"
