Database Reference
In-Depth Information
It's possible to run the server without journaling as a way of increasing performance
for some write loads. The downside is that the data files may be corrupted after an
unclean shutdown. As a consequence, anyone planning to disable journaling must
run with replication, preferably to a second data center, to increase the likelihood that
a pristine copy of the data will still exist even if there's a failure.
The topics of replication and durability are vast; you'll see a detailed exploration of
them in chapter 8.
1.2.6
Scaling
The easiest way to scale most databases is to upgrade the hardware. If your application
is running on a single node, it's usually possible to add some combination of disk
IOPS , memory, and CPU to ease any database bottlenecks. The technique of augment-
ing a single node's hardware for scale is known as vertical scaling or scaling up . Vertical
scaling has the advantages of being simple, reliable, and cost-effective up to a certain
point. If you're running on virtualized hardware (such as Amazon's EC2 ), then you
may find that a sufficiently large instance isn't available. If you're running on physical
hardware, there may come a point where the cost of a more powerful server becomes
prohibitive.
It then makes sense to consider scaling horizontally , or scaling out . Instead of beefing
up a single node, scaling horizontally means distributing the database across multiple
machines. Because a horizontally scaled architecture can use commodity hardware,
the costs for hosting the total data set can be significantly reduced. What's more, the
the distribution of data across machines mitigates the consequences of failure.
Machines will unavoidably fail from time to time. If you've scaled vertically, and the
machine fails, then you need to deal with the failure of a machine upon which most of
your system depends. This may not be an issue if a copy of the data exists on a repli-
cated slave, but it's still the case that only a single server need fail to bring down the
entire system. Contrast that with failure inside a horizontally scaled architecture. This
may be less catastrophic since a single machine represents a much smaller percentage
of the system as a whole.
MongoDB has been designed to make horizontal scaling manageable. It does so
via a range-based partitioning mechanism, known as auto-sharding , which automati-
cally manages the distribution of data across nodes. The sharding system handles the
addition of shard nodes, and it also facilitates automatic failover. Individual shards are
made up of a replica set consisting of at least two nodes, 4 ensuring automatic recovery
with no single point of failure. All this means that no application code has to handle
these logistics; your application code communicates with a sharded cluster just as it
speaks to a single node.
We've covered a lot of MongoDB's most compelling features; in chapter 2, we'll
begin to see how some of these work in practice. But at this point, we're going to take
4
Technically, each replica set will have at least three nodes, but only two of these need carry a copy of the data.
Search WWH ::




Custom Search