Figure 2.5 To get a feel for how expensive it is to access your hard drive compared to finding an item in RAM cache, think of how long it would take you to pick up an item in your backyard (RAM). Then think of how long it would take to drive to a location in your neighborhood (SSD), and finally think of how long it would take to pick up an item in Los Angeles if you lived in Chicago (HDD). Finding a query result in a local cache is far more efficient than rerunning an expensive query that needs HDD access.
You can do a few trillion calculations while you're waiting for your data to get back from LA. That's why calculating a hash is much faster than going to disk, and the more RAM you have, the lower the probability you need to make that long round trip.
The key to faster systems is to keep as much of the right information in RAM as you can, and to check your local servers to see whether they might already have a copy. This fast local data store is often called a RAM cache or memory cache. Yet accomplishing this, and determining when the cached data is no longer current, turns out to be a difficult problem.
Many memory caches use a simple timestamp for each block of memory in the cache as a way of keeping the most recently used objects in memory. When memory fills up, the timestamps are used to determine which items in memory are the oldest and should be overwritten. A more refined approach also takes into account how much time and how many resources it would take to re-create a dataset and store it in memory. This "cost model" allows more expensive queries to be kept in RAM longer than similar items that could be regenerated much faster.
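To make the cost model concrete, here is a minimal sketch, in Python, of an eviction policy that combines a last-used timestamp with an estimated rebuild cost. The class name, the scoring formula, and the rebuild_cost parameter are illustrative assumptions, not the policy of any particular database.

```python
import time


class CostAwareCache:
    """Toy in-memory cache: evicts the entry that is cheapest to lose.

    Each entry records when it was last used and an estimated cost to
    re-create it (for example, seconds needed to re-run the query).
    The eviction score divides staleness by cost, so old results that
    are easy to rebuild are dropped first, while expensive queries
    stay in RAM longer.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # key -> (value, last_used, rebuild_cost)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, _, cost = entry
        self.entries[key] = (value, time.time(), cost)  # refresh the timestamp
        return value

    def put(self, key, value, rebuild_cost):
        if key not in self.entries and len(self.entries) >= self.capacity:
            now = time.time()
            # Evict the entry with the highest staleness per unit of rebuild cost.
            victim = max(
                self.entries,
                key=lambda k: (now - self.entries[k][1]) / max(self.entries[k][2], 1e-9),
            )
            del self.entries[victim]
        self.entries[key] = (value, time.time(), rebuild_cost)
```

A plain least-recently-used policy is the special case in which every entry is assumed to have the same rebuild cost.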
The effective use of a RAM cache is predicated on efficiently answering the question, "Have we run this query before?" or, equivalently, "Have we seen this document before?" These questions can be answered by using consistent hashing, which lets you know if an item is already in the cache or if you need to retrieve it from SSD or HDD.
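As a rough sketch of that check, you can hash the query text into a fixed-size key and look the key up in an in-memory table before touching disk. The SHA-1 digest, the normalization step, and the function names below are illustrative assumptions, not a specific product's API.

```python
import hashlib

ram_cache = {}  # hash key -> cached query result


def query_key(query_text):
    """Derive a fixed-size cache key by hashing the normalized query text."""
    normalized = " ".join(query_text.split()).lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def run_query(query_text, execute):
    """Return a cached result if we've seen this query before;
    otherwise run the slow, disk-bound query and cache its result."""
    key = query_key(query_text)
    if key in ram_cache:          # cheap hash lookup in RAM
        return ram_cache[key]
    result = execute(query_text)  # expensive SSD/HDD work
    ram_cache[key] = result
    return result
```

Because the hash is computed entirely in RAM, checking "have we seen this before?" costs almost nothing compared with the disk access it can avoid.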
2.4 Using consistent hashing to keep your cache current
You've learned how important it is to keep frequently used data in your RAM cache, and how avoiding unnecessary disk access can improve your database performance. NoSQL systems expand on this concept and use a technique called consistent hashing to keep the most frequently used data in your cache.
Consistent hashing is a general-purpose process that's useful when evaluating how NoSQL systems work. Consistent hashing quickly tells you if a new query or document is already in your cache.
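In its classic form, consistent hashing places cache servers and keys on the same hash ring: a key is handled by the first node clockwise from its hash position, so adding or removing a node remaps only a small fraction of keys instead of invalidating the whole cache. The sketch below assumes that ring form; the node names, the MD5 hash, and the replica count are illustrative choices.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring mapping keys to cache nodes."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual nodes per server smooth out the balance
        self.ring = []            # sorted list of (hash position, node name)
        for node in nodes:
            self.add_node(node)

    def _hash(self, value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()

    def remove_node(self, node):
        self.ring = [(h, n) for h, n in self.ring if n != node]

    def node_for(self, key):
        """Walk clockwise from the key's hash to the first node position."""
        position = self._hash(key)
        index = bisect.bisect(self.ring, (position,)) % len(self.ring)
        return self.ring[index][1]


# Hypothetical usage: route a query's cache key to one of three cache servers.
ring = HashRing(["cache-1", "cache-2", "cache-3"])
node = ring.node_for("SELECT * FROM orders WHERE id = 42")
```

Because each node owns many small arcs of the ring, removing one server spreads its keys across the survivors rather than forcing every cached item to move.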