NOSQL Overview - Graph Databases


Figure A-2. Indexing reifies sets of entities in a document store

Where data hasn't been indexed, queries are typically much slower, because a full search of the dataset has to happen. This is obviously an expensive task and is to be avoided wherever possible. As we shall see, rather than process these queries internally, it's normal for document database users to externalize this kind of processing in parallel compute frameworks.
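To make the cost difference concrete, here is a minimal sketch in Python of an unindexed query next to an indexed one. It models a document store as a plain in-memory dictionary; the documents, the field names, and the helper functions (`full_scan`, `build_index`, `indexed_lookup`) are all hypothetical and do not correspond to any particular product's API.

```python
from collections import defaultdict

# Hypothetical documents, keyed by document ID (illustrative data only).
documents = {
    "u1": {"name": "Alice", "city": "London"},
    "u2": {"name": "Bob", "city": "Paris"},
    "u3": {"name": "Carol", "city": "London"},
}


def full_scan(docs, field, value):
    """Unindexed query: every document must be inspected."""
    return sorted(
        doc_id for doc_id, doc in docs.items() if doc.get(field) == value
    )


def build_index(docs, field):
    """Build a secondary index mapping each field value to a set of IDs."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        if field in doc:
            index[doc[field]].add(doc_id)
    return index


def indexed_lookup(index, value):
    """Indexed query: one dictionary lookup, independent of dataset size."""
    return sorted(index.get(value, set()))


city_index = build_index(documents, "city")

print(full_scan(documents, "city", "London"))
print(indexed_lookup(city_index, "London"))
```

Both calls return the same set of document IDs. The difference is that the scan touches every document on every query, whereas the index pays that cost once, when it is built, and must then be maintained on each write.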

Because the data model of a document store is one of disconnected entities, document stores tend to have interesting and useful operational characteristics. They should scale horizontally, because there is no contended state between mutually independent records at write time, and no need to transact across replicas.

Sharding

Most document databases (e.g., MongoDB, RavenDB) require users to plan for sharding of data across logical instances to support scaling horizontally. Scaling out thus becomes an explicit aspect of development and operations. (Key-value and column family databases, in contrast, tend not to require this planning, because they allocate data to replicas as a normal part of their internal implementation.) This is sometimes puzzlingly cited as a positive reason for choosing document stores, most likely because it induces a (misplaced) excitement that scale is something to be embraced and lauded, rather than something to be skillfully and diligently mastered.
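As a rough illustration of what planning for sharding involves, the sketch below routes each document to one of a fixed number of logical instances by hashing its ID. This is a generic hash-partitioning scheme, not the mechanism used by MongoDB or RavenDB; the `ShardedStore` class and `shard_for` function are hypothetical names, and each shard is modeled as an in-memory dictionary.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document ID to a shard number using a stable hash."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Hypothetical store that spreads documents across in-memory shards."""

    def __init__(self, num_shards: int):
        # Each shard stands in for a separate logical database instance.
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, doc_id: str, doc: dict) -> None:
        shard = self.shards[shard_for(doc_id, len(self.shards))]
        shard[doc_id] = doc

    def get(self, doc_id: str):
        shard = self.shards[shard_for(doc_id, len(self.shards))]
        return shard.get(doc_id)


store = ShardedStore(num_shards=4)
for i in range(100):
    store.put(f"order-{i}", {"total": i})

print(store.get("order-42"))
print([len(shard) for shard in store.shards])
```

Because each write touches exactly one shard, independent documents never contend with one another. The cost shows up elsewhere: with this simple modulo scheme, changing the number of shards remaps most document IDs, which is one reason the shard layout has to be planned up front.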

For writes, document databases tend to provide transactionality limited to the level of an individual record. That is, a document database will ensure that writes to a single document are atomically persisted, assuming the administrator has opted for safe
