Backing Up Large Databases
Creating effective backup solutions can become a problem when working with large database systems. Often the time
taken to make a copy of the database is significant; it may even require hours to complete. During that time, you have
to maintain the database in a consistent state, so the backup does not contain files that were copied at different points
in time. The holy grail of a database backup system is a point-in-time snapshot, which can be done very quickly. The
faster the snapshot can be done, the smaller the window of time during which the database server must be frozen.
Using a Hidden Secondary Server for Backups
One technique used to perform large backups is to make the backup from a hidden secondary that can be frozen
while the backup is taken. Once the backup is complete, this secondary is restarted and catches up with the
primary through replication.
MongoDB's replication mechanism makes it relatively simple to set up a hidden secondary and have it track the
primary server (see Chapter 11 for more details on how to set up a hidden secondary).
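As a rough illustration of what a hidden secondary looks like in a replica set configuration, the sketch below adds one to an existing configuration document. The function name add_hidden_member, the replica set name, and the host names are placeholders invented for this example; the "hidden" and "priority" member fields and the replSetReconfig command are standard MongoDB, but treat this as a sketch under those assumptions rather than the procedure from Chapter 11.

```python
import copy


def add_hidden_member(config, host):
    """Return a new replica set config with `host` added as a hidden member.

    `config` is a replica set configuration document, shaped like the
    `config` field returned by the replSetGetConfig command.
    """
    new_config = copy.deepcopy(config)
    next_id = max((m["_id"] for m in new_config["members"]), default=-1) + 1
    new_config["members"].append({
        "_id": next_id,
        "host": host,
        "priority": 0,   # a hidden member must never be elected primary
        "hidden": True,  # keeps the member invisible to client applications
    })
    # A reconfiguration must carry a higher version than the current config.
    new_config["version"] = new_config.get("version", 0) + 1
    return new_config


# Hypothetical two-member set; all names here are placeholders.
current = {
    "_id": "rs0",
    "version": 3,
    "members": [
        {"_id": 0, "host": "db1.example.net:27017"},
        {"_id": 1, "host": "db2.example.net:27017"},
    ],
}
updated = add_hidden_member(current, "backup.example.net:27017")

# With pymongo (assumed, not executed here) the change could be applied with:
#   client.admin.command("replSetReconfig", updated)
```

Because the member has priority 0 and is hidden, applications never read from it, which is what makes it safe to freeze during a backup.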
Creating Snapshots with a Journaling Filesystem
Many modern volume managers have the ability to create snapshots of the state of the drive at any particular point
in time. Using a filesystem snapshot is one of the fastest and most efficient methods of creating a backup of your
MongoDB instance. While setting up one of these systems is beyond the scope of this topic, we can show you how to
place the MongoDB server in a state where all of its data is consistent on disk. We also show you how to
block writes so that further changes are not written to the disk, but are instead buffered in memory.
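One way to flush data to disk and block writes is MongoDB's fsync command with the lock option, released afterwards with fsyncUnlock (the mongo shell exposes these as db.fsyncLock() and db.fsyncUnlock()). The sketch below wraps the pair in a context manager so that the lock is always released. It assumes `client` behaves like a pymongo MongoClient; writes_locked, backup_with_lock, and take_snapshot are hypothetical names introduced for this example.

```python
from contextlib import contextmanager


@contextmanager
def writes_locked(client):
    """Flush pending writes to disk and block new ones inside the `with` block.

    `client` is assumed to behave like a pymongo MongoClient connected to the
    server being backed up.
    """
    client.admin.command("fsync", lock=True)  # flush to disk, then lock
    try:
        yield
    finally:
        client.admin.command("fsyncUnlock")   # release the lock even on error


def backup_with_lock(client, take_snapshot):
    """Run `take_snapshot` (a hypothetical callable) while writes are blocked."""
    with writes_locked(client):
        return take_snapshot()
```

Keeping the unlock in a finally clause matters here: if the snapshot step fails, the server should not be left with writes blocked.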
A snapshot allows you to read the drive exactly as it was when the snapshot was taken. A system's volume or
filesystem manager makes sure that any blocks of data on the disk that are changed after the snapshot is taken are not
written back to the same place on the drive; this preserves the data as it was at snapshot time, so that it can still
be read. Generally, the procedure for using a snapshot goes something like this:
1. Create a snapshot.
2. Copy data from the snapshot or restore the snapshot to another volume, depending on
   your volume manager.
3. Release the snapshot; doing so releases all preserved disk blocks that are no longer needed
   back into the free space chain on the drive.
4. Back up the copied data while the server is still running.
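The steps above can be sketched for one particular volume manager. The example below assumes Linux LVM and only builds the commands for a single snapshot cycle; it does not run them. The function name lvm_backup_plan, the volume group and volume names, the paths, and the 10G snapshot size are all placeholders chosen for illustration.

```python
import shlex


def lvm_backup_plan(volume_group, logical_volume, snapshot_name,
                    mount_point, archive_path, snapshot_size="10G"):
    """Return the commands (as argv lists) for one LVM snapshot backup cycle."""
    origin = f"/dev/{volume_group}/{logical_volume}"
    snapshot = f"/dev/{volume_group}/{snapshot_name}"
    return [
        # 1. Create the snapshot; the size is the space set aside for blocks
        #    that change while the snapshot exists.
        ["lvcreate", "--snapshot", "--size", snapshot_size,
         "--name", snapshot_name, origin],
        # 2. Mount the snapshot read-only and copy the data out of it.
        ["mount", "-o", "ro", snapshot, mount_point],
        ["tar", "-czf", archive_path, "-C", mount_point, "."],
        # 3. Release the snapshot so its preserved blocks are freed.
        ["umount", mount_point],
        ["lvremove", "--force", snapshot],
        # 4. The archive at archive_path can now be backed up elsewhere
        #    while the server keeps running.
    ]


plan = lvm_backup_plan("vg0", "mongodb", "mongodb_snap",
                       "/mnt/mongodb_snap", "/backup/mongodb.tar.gz")
for argv in plan:
    print(shlex.join(argv))
```

To execute the plan rather than print it, each argv list could be passed to subprocess.run(argv, check=True) by a user with sufficient privileges; other volume managers use different commands for the same steps.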
The great thing about the method just described is that reads against the data can continue unhindered while the
snapshot is taken.
Some volume managers that have this capability include:
Linux and the LVM volume management system
Sun ZFS
Amazon EBS volumes
Windows Server using shadow copies
Most of those volume managers have the ability to perform a snapshot in a very short time—often just a few
seconds—even on very large amounts of data. The volume managers don't actually copy the data out at this point;
instead, they effectively insert a bookmark onto the drive, so that you can read the drive in the state it existed at the
point in time the snapshot was taken.
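The "bookmark" idea can be illustrated with a toy model. The class below is not how any real volume manager is implemented; it is a minimal sketch, with invented names, showing why taking a snapshot is nearly instant (nothing is copied) and how old blocks are preserved only when they are later overwritten.

```python
class ToyVolume:
    """A toy block device with copy-on-write snapshots (illustration only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []  # each snapshot maps block index -> preserved old value

    def snapshot(self):
        """Take a snapshot: nothing is copied, we only start tracking changes."""
        preserved = {}
        self.snapshots.append(preserved)
        return preserved

    def write(self, index, value):
        """Save the old block for each open snapshot that has not saved it yet."""
        for preserved in self.snapshots:
            preserved.setdefault(index, self.blocks[index])
        self.blocks[index] = value

    def read_snapshot(self, preserved, index):
        """Read a block as it was when the snapshot was taken."""
        return preserved.get(index, self.blocks[index])

    def release(self, preserved):
        """Drop the snapshot along with the old blocks it was holding."""
        self.snapshots = [s for s in self.snapshots if s is not preserved]


vol = ToyVolume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")                       # the live volume changes...
old_value = vol.read_snapshot(snap, 1)  # ...but the snapshot still sees "b"
vol.release(snap)
```

The cost of a snapshot in this model grows with the number of blocks changed after it was taken, not with the size of the volume, which is why a snapshot should be released once the data has been copied.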
 