cluster). And although Oracle can do this with its impressive Real Application Clusters
(RAC) architecture, you can expect to take out a mortgage if you want to use that
solution: implementing a RAC-based solution requires multiple servers, shared storage,
and several software licenses.
You might wonder why having an active/active cluster on two databases is so
difficult. When you query your database, the database has to find all the relevant data
and link it all together. RDBMS solutions feature many ingenious ways to improve
performance, but they all rely on having a complete picture of the data available. And
this is where you hit a wall: this approach simply doesn't work when half the data is on
another server.
Of course, you might have a small database that simply gets lots of requests, so you
just need to share the workload. Unfortunately, here you hit another wall. You need
to ensure that data written to the first server is available to the second server. And you
face additional issues if updates are made on two separate masters simultaneously.
For example, you need to determine which update is the correct one. Another problem
you can encounter: someone might query the second server for information that has
just been written to the first server, but that information hasn't been updated yet on the
second server. When you consider all these issues, it becomes easy to see why the Oracle
solution is so expensive—these problems are extremely hard to address.
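To make these two problems concrete, here is a minimal sketch in Python. It assumes two in-memory "servers" that ship their writes to each other only when told to; the Server class and its replicate_to method are hypothetical names invented for this illustration and do not model any real database's replication protocol.

```python
class Server:
    """Toy server: a dict of data plus a queue of writes not yet shipped to the peer."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.pending = []

    def write(self, key, value):
        # Apply the write locally and remember it for later replication.
        self.data[key] = value
        self.pending.append((key, value))

    def replicate_to(self, peer):
        # Ship queued writes to the peer, overwriting whatever the peer holds.
        for key, value in self.pending:
            peer.data[key] = value
        self.pending.clear()


first = Server("first")
second = Server("second")

# Problem 1: a read on the second server before replication sees nothing.
first.write("balance", 100)
stale = second.data.get("balance")   # not replicated yet
first.replicate_to(second)
fresh = second.data.get("balance")   # now visible

# Problem 2: both masters accept a write to the same key at the same time.
first.write("balance", 150)
second.write("balance", 175)
first.replicate_to(second)
second.replicate_to(first)

print(stale, fresh)
print(first.data["balance"], second.data["balance"])
```

After the final exchange, each server holds the other's value, so the two copies disagree and nothing in this naive scheme says which update should win. A real active/active system has to add some rule for that, and this is the hard part.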
MongoDB solves the active/active cluster problems in a very clever way—it avoids
them completely. Recall that MongoDB stores data in BSON documents, so the data
is self-contained. That is, although similar documents are stored together, individual
documents aren't made up of relationships. This means that everything you need is all in
one place. Because queries in MongoDB look for specific keys and values in a document,
this information can be easily spread across as many servers as you have available. Each
server checks the content it has and returns the result. This effectively allows performance
to scale almost linearly as you add servers. As an added bonus, it doesn't even require that you
take out a new mortgage to pay for this functionality.
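The idea of each server checking its own content and returning its share of the result can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the servers are plain lists of dictionaries, and the function names (matches, query_server, scatter_gather) are made up for this example rather than taken from MongoDB.

```python
# Hypothetical stand-ins for three servers, each holding some self-contained documents.
servers = [
    [{"_id": 1, "type": "book", "title": "Dune"},
     {"_id": 2, "type": "cd", "title": "Kind of Blue"}],
    [{"_id": 3, "type": "book", "title": "Emma"}],
    [{"_id": 4, "type": "dvd", "title": "Alien"},
     {"_id": 5, "type": "book", "title": "Ulysses"}],
]


def matches(document, criteria):
    """Return True if the document contains every key/value pair in criteria."""
    return all(key in document and document[key] == value
               for key, value in criteria.items())


def query_server(documents, criteria):
    """Each server looks only at the documents it holds."""
    return [doc for doc in documents if matches(doc, criteria)]


def scatter_gather(all_servers, criteria):
    """Send the same query to every server and merge the partial results."""
    results = []
    for documents in all_servers:
        results.extend(query_server(documents, criteria))
    return results


books = scatter_gather(servers, {"type": "book"})
print(sorted(doc["_id"] for doc in books))  # prints [1, 3, 5]
```

Because every document carries everything the query needs, no server has to ask another server for related rows before it can answer. That independence is what lets the work be divided.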
Admittedly, MongoDB does not offer master/master replication, in which two
separate servers can both accept write requests. However, it does have sharding, which
allows data to be split across multiple machines, with each machine responsible for
updating different parts of the dataset. The benefit of this design is that, while some
solutions allow two master databases, MongoDB can potentially scale to hundreds of
machines as easily as it can run on two.
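The following Python sketch shows the general principle of making each machine responsible for a different part of the dataset. It is a toy model, not MongoDB's implementation: the ShardedCollection class is hypothetical, the shards are in-memory dictionaries, and the simple hash-modulo routing is an assumption chosen to keep the example short.

```python
import hashlib


class ShardedCollection:
    """Toy model: each shard is a dict that owns a disjoint slice of the keys."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def shard_for(self, shard_key):
        # Hash the key so that the same key always maps to the same shard.
        digest = hashlib.md5(str(shard_key).encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def write(self, shard_key, document):
        # Exactly one shard accepts the write for any given key.
        self.shards[self.shard_for(shard_key)][shard_key] = document

    def read(self, shard_key):
        return self.shards[self.shard_for(shard_key)].get(shard_key)


collection = ShardedCollection(shard_count=4)
for user_id in range(100):
    collection.write(user_id, {"user_id": user_id, "visits": 0})

# A second write to the same key lands on the same shard, so there is
# never a competing copy of the document on another machine.
collection.write(42, {"user_id": 42, "visits": 1})

print([len(shard) for shard in collection.shards])
print(collection.read(42))
```

Since only one shard ever accepts writes for a given key, the question of which of two simultaneous updates is correct does not arise between machines. Adding machines means dividing the keys into more slices.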
Opting for Performance vs. Features
Performance is important, but MongoDB also provides a large feature set. We've
already discussed some of the features MongoDB doesn't implement, and you might
be somewhat skeptical of the claim that MongoDB achieves its impressive performance
partly by judiciously excising certain features common to other databases. However,
there are analogous database systems available that are extremely fast, but also extremely
limited, such as those that implement a key/value store.
A perfect example is memcached. This application was written to provide high-speed
data caching, and it is mind-numbingly fast. When used to cache website content, it can
speed up an application many times over. This application is used by extremely large
websites, such as Facebook and LiveJournal.
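To illustrate why a simple key/value cache speeds up a website, here is a minimal sketch of the usual look-in-the-cache-first pattern. A plain Python dictionary stands in for the memcached server, and render_page is a hypothetical slow function; nothing here uses a real memcached client.

```python
import time

cache = {}               # stand-in for a memcached server (assumption, not a real client)
calls = {"count": 0}     # counts how often the slow path actually runs


def render_page(page_id):
    """Hypothetical slow operation, such as building a page from database queries."""
    calls["count"] += 1
    time.sleep(0.01)
    return f"<html>page {page_id}</html>"


def get_page(page_id):
    """Return the cached page if present; otherwise build it and store it."""
    key = f"page:{page_id}"
    if key in cache:
        return cache[key]
    page = render_page(page_id)
    cache[key] = page
    return page


first_hit = get_page(1)    # slow: builds the page and stores it
second_hit = get_page(1)   # fast: served straight from the cache
print(calls["count"])      # prints 1
```

The cache can only store and fetch a value by its key. That narrow feature set is exactly what makes this kind of store so fast, and also so limited.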