Introduction to MongoDB - The Definitive Guide to MongoDB

Of course, none of this means that data safety isn't important. MongoDB wouldn't be of much use if you couldn't

count on being able to access the data when you need it. Initially, MongoDB provided a safety net with a feature called

master-slave replication, in which only one database is active for writing at any given time, an approach that is also

fairly common in the RDBMS world. This feature has since been replaced with replica sets, and basic master-slave

replication has been deprecated and should no longer be used.

Replica sets have one primary server (similar to a master), which handles all the write requests from clients.

Because there is only one primary server in a given set, it can guarantee that all writes are handled properly. When a

write occurs, it is logged in the primary's oplog.

The oplog is replicated by the secondary servers (of which there can be many) and used to bring themselves up

to date with the primary. Should the primary fail at any time, one of the secondaries will become the new primary and

take over responsibility for handling client write requests.
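
The write path described above can be illustrated with a small toy model. This is only a sketch in plain Python, not MongoDB's implementation or API: the `Node` and `ReplicaSet` classes, the `(key, value)` oplog entries, and the "promote the most up-to-date secondary" rule are all assumptions made for illustration (a real replica set chooses its new primary through an election).

```python
class Node:
    """A toy replica-set member: a data dict plus an ordered operation log."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.oplog = []  # (key, value) write operations, in the order applied

    def apply(self, op):
        key, value = op
        self.data[key] = value
        self.oplog.append(op)


class ReplicaSet:
    """Hypothetical model: one primary takes writes, secondaries replay its oplog."""

    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = list(secondaries)

    def write(self, key, value):
        # All client writes go through the single primary.
        self.primary.apply((key, value))

    def sync(self):
        # Each secondary replays only the oplog entries it has not yet applied.
        for node in self.secondaries:
            for op in self.primary.oplog[len(node.oplog):]:
                node.apply(op)

    def fail_over(self):
        # Simplification: promote the secondary with the longest oplog.
        new_primary = max(self.secondaries, key=lambda n: len(n.oplog))
        self.secondaries.remove(new_primary)
        self.primary = new_primary


rs = ReplicaSet(Node("a"), [Node("b"), Node("c")])
rs.write("user:1", "alice")
rs.write("user:2", "bob")
rs.sync()        # secondaries catch up with the primary
rs.fail_over()   # the primary "fails"; a secondary takes over
rs.write("user:3", "carol")
rs.sync()
```

After the failover, writes continue against the promoted node, and the remaining secondary catches up by replaying the new primary's oplog.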

Implementing Sharding

For those involved with large-scale deployments, auto-sharding will probably prove one of MongoDB's most

significant and oft-used features.

In an auto-sharding scenario, MongoDB takes care of all the data splitting and recombination for you. It makes

sure the data goes to the right server and that queries are run and combined in the most efficient manner possible.

In fact, from a developer's point of view, there is no difference between talking to a MongoDB database with a

hundred shards and talking to a single MongoDB server. This feature is not yet production-ready; when it is,

however, it will push MongoDB's scalability through the roof.

In the meantime, if you're just starting out or you're building your first MongoDB-based website, then you'll

probably find that a single instance of MongoDB is sufficient for your needs. If you end up building the next Facebook

or Amazon, however, you will be glad that you built your site on a technology that can scale so limitlessly. Sharding is

the topic of Chapter 12 of this book.
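
To make the idea of splitting and recombining data concrete, here is a minimal sketch in plain Python. It is not how MongoDB is implemented: the `ShardedCollection` class, its `split_points` parameter, and the range-based routing rule are hypothetical names and choices made for this illustration. The point is only that the caller inserts and queries through one object, while the routing to individual shards stays hidden.

```python
import bisect


class ShardedCollection:
    """Toy router: range-partitions documents across shards by a shard key."""

    def __init__(self, shard_key, split_points):
        self.shard_key = shard_key
        self.split_points = sorted(split_points)
        # There is always one more shard than there are split points.
        self.shards = [[] for _ in range(len(self.split_points) + 1)]

    def _shard_for(self, key_value):
        # Index of the key range that contains key_value.
        return bisect.bisect_right(self.split_points, key_value)

    def insert(self, doc):
        # The router, not the caller, decides which shard stores the document.
        self.shards[self._shard_for(doc[self.shard_key])].append(doc)

    def find(self, predicate):
        # Scatter the query to every shard, then gather the results.
        return [doc for shard in self.shards for doc in shard if predicate(doc)]


users = ShardedCollection("name", split_points=["h", "p"])
for name, age in [("alice", 31), ("ivan", 45), ("zoe", 28), ("bob", 52)]:
    users.insert({"name": name, "age": age})

over_30 = sorted(d["name"] for d in users.find(lambda d: d["age"] > 30))
```

The calling code never mentions a shard: it would look the same whether the collection were backed by one list or a hundred.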

Using Map and Reduce Functions

For many people, hearing the term MapReduce sends shivers down their spines. At the other extreme, many RDBMS

advocates scoff at the complexity of map and reduce functions. It's scary for some because these functions require a

completely different way of thinking about finding and sorting your data, and many professional programmers have

trouble getting their heads around the concepts that underpin map and reduce functions. That said, these functions

provide an extremely powerful way to query data. In fact, CouchDB supports only this approach, which is one reason

it has such a steep learning curve.

MongoDB doesn't require that you use map and reduce functions. In fact, MongoDB relies on a simple querying

syntax that is more akin to what you see in MySQL. However, MongoDB does make these functions available for those

who want them. The map and reduce functions are written in JavaScript and run on the server. The job of the map

function is to find all the documents that meet certain criteria. These results are then passed to the reduce function,

which processes the data. The reduce function doesn't usually return a collection of documents; rather, it returns a

new document that contains the information derived. As a general rule, if you would normally use GROUP BY in SQL,

then the map and reduce functions are probably the right tools for the job in MongoDB.
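
The division of labor between the two functions can be sketched in a few lines. The example below is a plain-Python simulation of the pattern, not MongoDB's own interface (where, as noted, the functions are written in JavaScript and run on the server). The `map_reduce` helper, the `orders` data, and the field names are all invented for illustration. It computes what `SELECT customer, SUM(amount) FROM orders WHERE status = 'paid' GROUP BY customer` would compute in SQL.

```python
from collections import defaultdict


def map_reduce(documents, map_fn, reduce_fn):
    """Group the (key, value) pairs produced by map_fn, then reduce each group."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def map_order(doc):
    # Map step: select the documents of interest and emit a (key, value) pair.
    if doc["status"] == "paid":
        yield doc["customer"], doc["amount"]


def reduce_total(key, values):
    # Reduce step: collapse all values emitted for one key into a single result.
    return sum(values)


orders = [
    {"customer": "alice", "amount": 20, "status": "paid"},
    {"customer": "bob", "amount": 15, "status": "paid"},
    {"customer": "alice", "amount": 5, "status": "paid"},
    {"customer": "bob", "amount": 99, "status": "cancelled"},
]

totals = map_reduce(orders, map_order, reduce_total)
```

Here `totals` holds one derived value per customer rather than a list of the original documents, which mirrors the point above that the reduce step returns derived information, not the matching documents themselves.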

Note: You should not think of MongoDB's map and reduce functions as poor imitations of the approach adopted by

CouchDB. If you so desired, you could use MongoDB's map and reduce functions for everything in lieu of MongoDB's

innate query support.
