Database Administration - The Definitive Guide to MongoDB


Backing Up Large Databases

Creating effective backup solutions can become a problem when working with large database systems. Copying the database often takes a significant amount of time; it may even take hours to complete. During that time, you have to keep the database in a consistent state, so that the backup does not contain files that were copied at different points in time. The holy grail of a database backup system is a point-in-time snapshot that can be taken very quickly. The faster the snapshot can be taken, the smaller the window of time during which the database server must be frozen.

Using a Hidden Secondary Server for Backups

One technique for backing up a large database is to take the backup from a hidden secondary that can be frozen while the backup is taken. This secondary server is then restarted so that it can catch up with the primary after the backup is complete.

MongoDB's replication mechanism makes it simple to set up a hidden secondary and have it track the primary server (see Chapter 11 for more details on how to set up a hidden secondary).
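
As a rough illustration of what "hidden" means at the configuration level, the sketch below edits a replica set configuration document so that one member is hidden. The fields `hidden` and `priority` are real replica set member settings (a hidden member must have priority 0); the helper name `hide_member`, the replica set name, and the host names are hypothetical placeholders, not taken from the book.

```python
import copy


def hide_member(config, host):
    """Return a copy of a replica set config with `host` marked hidden.

    A hidden member must have priority 0 so that it can never become primary.
    """
    new_config = copy.deepcopy(config)
    for member in new_config["members"]:
        if member["host"] == host:
            member["hidden"] = True
            member["priority"] = 0
            break
    else:
        raise ValueError(f"no replica set member with host {host!r}")
    # A reconfiguration is expected to carry a higher version number.
    new_config["version"] = config["version"] + 1
    return new_config


# Hypothetical three-member replica set; all names are placeholders.
example_config = {
    "_id": "rs0",
    "version": 1,
    "members": [
        {"_id": 0, "host": "db1.example.net:27017"},
        {"_id": 1, "host": "db2.example.net:27017"},
        {"_id": 2, "host": "backup.example.net:27017"},
    ],
}

hidden_config = hide_member(example_config, "backup.example.net:27017")
print(hidden_config["members"][2])
```

Assuming a driver such as PyMongo, a document like `hidden_config` could then be sent to the primary with the `replSetReconfig` command; Chapter 11 remains the reference for the full procedure.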

Creating Snapshots with a Journaling Filesystem

Many modern volume managers can create snapshots of the state of a drive at any particular point in time. Using a filesystem snapshot is one of the fastest and most efficient methods of creating a backup of your MongoDB instance. While setting up one of these systems is beyond the scope of this topic, we can show you how to place the MongoDB server in a state where all of its data is consistent on disk. We also show you how to block writes so that further changes are not written to the disk, but are instead buffered in memory.
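
One way to picture the "flush, then block writes" step is as a context manager wrapped around the snapshot. The sketch below is a minimal illustration, not the book's own listing: it assumes a PyMongo-style client whose `admin.command()` method sends the server's `fsync` command with `lock: true` and later `fsyncUnlock`. The names `writes_blocked` and `RecordingClient` are hypothetical, and the recording stand-in exists only so the example runs without a server.

```python
from contextlib import contextmanager


@contextmanager
def writes_blocked(client):
    """Flush pending writes to disk and block new ones for the duration."""
    # fsync with lock=True flushes the data files and blocks further writes.
    client.admin.command("fsync", lock=True)
    try:
        yield
    finally:
        # Always release the lock, even if the snapshot step fails.
        client.admin.command("fsyncUnlock")


class _RecordingAdmin:
    """Stand-in for a driver's admin database; it only records commands."""

    def __init__(self):
        self.calls = []

    def command(self, name, **kwargs):
        self.calls.append((name, kwargs))
        return {"ok": 1}


class RecordingClient:
    """Hypothetical stand-in for a real client, used for a dry run."""

    def __init__(self):
        self.admin = _RecordingAdmin()


client = RecordingClient()
with writes_blocked(client):
    pass  # take the filesystem snapshot here
print(client.admin.calls)
```

Putting the unlock in a `finally` clause keeps a failed snapshot from leaving the server locked, which matters because the goal is to keep the frozen window as short as possible.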

A snapshot allows you to read the drive exactly as it was when the snapshot was taken. A system's volume or filesystem manager makes sure that any blocks of data on the disk that are changed after the snapshot is taken are not written back to the same place on the drive; this preserves the data on the disk as it was at snapshot time, so that it can still be read. Generally, the procedure for using a snapshot goes something like this:

1. Create a snapshot.

2. Copy data from the snapshot or restore the snapshot to another volume, depending on your volume manager.

3. Release the snapshot; doing so releases all preserved disk blocks that are no longer needed back into the free space chain on the drive.

4. Back up the copied data while the server is still running.
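
To make the steps above concrete, here is a hedged sketch of how they might map onto Linux LVM. It only builds the command lines and prints them; nothing is executed. The function name `lvm_snapshot_plan`, the volume group, volume names, paths, and snapshot size are all assumptions for illustration, and the mount options you need depend on the filesystem in use.

```python
import shlex


def lvm_snapshot_plan(volume_group, logical_volume, snapshot_name,
                      mount_point, archive_path, snapshot_size="10G"):
    """Build (but do not run) the commands for an LVM snapshot backup."""
    origin = f"/dev/{volume_group}/{logical_volume}"
    snapshot = f"/dev/{volume_group}/{snapshot_name}"
    return [
        # Step 1: create the snapshot of the volume holding the data files.
        ["lvcreate", "--snapshot", "--size", snapshot_size,
         "--name", snapshot_name, origin],
        # Step 2: copy the data out of the snapshot (mounted read-only).
        ["mount", "-o", "ro", snapshot, mount_point],
        ["tar", "-czf", archive_path, "-C", mount_point, "."],
        # Step 3: release the snapshot so its preserved blocks are freed.
        ["umount", mount_point],
        ["lvremove", "--force", snapshot],
    ]


# All names below are hypothetical placeholders.
plan = lvm_snapshot_plan(
    volume_group="vg0",
    logical_volume="mongodb",
    snapshot_name="mongodb-snap",
    mount_point="/mnt/mongodb-snap",
    archive_path="/backup/mongodb.tar.gz",
)
for command in plan:
    print(shlex.join(command))
```

Step 4, backing up the copied data, would then operate on the archive rather than on the live volume. Returning argument lists instead of shell strings means each command could later be handed to `subprocess.run` without any quoting concerns.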

The great thing about the method just described is that reads against the data can continue unhindered while the snapshot is taken.

Some volume managers that have this capability include:

• Linux and the LVM volume management system

• Sun ZFS

• Amazon EBS volumes

• Windows Server using shadow copies

Most of those volume managers can take a snapshot in a very short time, often just a few seconds, even on very large amounts of data. The volume managers don't actually copy the data out at this point; instead, they effectively insert a bookmark onto the drive, so that you can read the drive in the state in which it existed when the snapshot was taken.
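
The "bookmark" idea can be illustrated with a toy model. The sketch below is not how any particular volume manager is implemented; it assumes a copy-on-write scheme (the original block is saved aside the first time it changes), which differs from a redirect-on-write scheme in where the new data lands but gives a reader of the snapshot the same view. The class name `CowVolume` and its methods are hypothetical.

```python
class CowVolume:
    """Toy block device with a single copy-on-write snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.preserved = None  # block index -> contents at snapshot time

    def snapshot(self):
        # Taking the snapshot copies no data; it only starts tracking changes.
        self.preserved = {}

    def write(self, index, data):
        # Save the original block the first time it changes after the snapshot.
        if self.preserved is not None and index not in self.preserved:
            self.preserved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, index):
        if self.preserved is None:
            raise RuntimeError("no active snapshot")
        # Unchanged blocks are read straight from the live volume.
        return self.preserved.get(index, self.blocks[index])

    def release(self):
        # Dropping the snapshot frees every preserved block.
        self.preserved = None


volume = CowVolume(["a", "b", "c"])
volume.snapshot()
volume.write(1, "B")
snapshot_view = [volume.read_snapshot(i) for i in range(3)]
print(snapshot_view, volume.blocks)
```

In this model the cost of `snapshot()` does not depend on the size of the volume, which is why real snapshots complete in seconds; the work is deferred to later writes.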
