Hopefully this tip will become irrelevant soon, once the bug for online compaction is fixed.
repair basically does a mongodump and then a mongorestore, making a clean copy of your data and, in the process, removing any empty “holes” in your data files. (When you do a lot of deletes or updates that move things around, large parts of your collection could be sitting around empty.) repair re-inserts everything in a compact form.
Remember the caveats to using repair:
• It will block operations, so you don't want to run it on a master. Instead, run it on
each secondary first, then finally step down the primary and run it on that server.
• It needs free disk space roughly equal to the size of your data, because it writes a complete new copy before removing the old one (e.g., if you have 200GB of data, your disk must have at least 200GB of free space to run repair).
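The disk-space caveat is easy to check before starting. The following is a minimal sketch, assuming both sizes are known in whole gigabytes; has_room_for_repair is a hypothetical helper name, not a MongoDB tool.

```shell
# Hypothetical helper (not part of MongoDB): succeeds when the free disk
# space is at least as large as the data, which is what repair needs.
# Arguments: data size in GB, free disk space in GB (whole numbers).
has_room_for_repair() {
    data_gb=$1
    free_gb=$2
    [ "$free_gb" -ge "$data_gb" ]
}

# 200GB of data with 200GB free, as in the caveat above.
if has_room_for_repair 200 200; then
    echo "ok to run repair"
else
    echo "not enough free space for repair"
fi
```

With less free space than data, the check fails and repair should not be attempted on that disk.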
One problem a lot of people have is that they have too much data to run repair: they might have a 500GB database on a server with 700GB of disk. If you're in this situation, you can do a “manual” repair by doing a mongodump and then a mongorestore.
For example, suppose we have a server, ny1, that's filling up with mostly empty space. The database is 300GB and the server it's on only has a 400GB disk. However, we also have ny2, which is an identical 400GB machine with nothing on it yet. First, we step down ny1, if it is master, and fsync and lock it so that there's a consistent view of its data on disk:
> rs.stepDown()
> db.runCommand({fsync : 1, lock : 1})
We can then log into ny2 and run:
ny2$ mongodump --host ny1
This will dump the database to a directory called dump on ny2.
mongodump will probably be constrained by network speed in the above operation. If you have physical access to the machine, plug in an external hard drive and do a local mongodump to that.
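A local dump to an external drive might look like the sketch below. It assumes the drive is already mounted on ny1; dump_to is a hypothetical wrapper, and the only MongoDB-specific piece is mongodump's --out option, which sets the output directory. The wrapper refuses to start if the target directory is missing or not writable, which catches an unmounted drive before the dump begins.

```shell
# Hypothetical wrapper around a local mongodump (run on ny1 itself).
# Argument: a directory on the external drive.
dump_to() {
    out_dir=$1
    # An unmounted or read-only drive shows up here as a missing or
    # unwritable directory, so fail early instead of filling the local disk.
    if [ ! -d "$out_dir" ] || [ ! -w "$out_dir" ]; then
        echo "cannot write to $out_dir" >&2
        return 1
    fi
    # With no --host, mongodump connects to the mongod on this machine.
    mongodump --out "$out_dir/dump"
}
```

It would be called as dump_to /mnt/external, where /mnt/external is an assumed mount point for the drive.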
Once you have a dump, you have to restore it to ny1:
1. Shut down the mongod running on ny1.
2. Back up the data files on ny1 (e.g., take an EBS snapshot), just in case.
3. Delete the data files on ny1.
4. Restart the (now empty) ny1. If it was part of a replica set, start it up on a different port and without --replSet, so that it (and the rest of the set) doesn't get confused.
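The four steps above can be sketched as a single shell function. This is only an illustration under stated assumptions: the data directory, backup location, log path, and port are all hypothetical and must be adjusted, the backup is a plain file copy and not a snapshot, and mongod --shutdown is available only on Linux (elsewhere, shut the server down from the mongo shell).

```shell
# All of these values are assumptions for illustration; change them.
DBPATH=/data/db                  # hypothetical data directory on ny1
BACKUP=/backup/db-before-repair  # hypothetical backup location
LOGPATH=/var/log/mongod-restore.log
PORT=27018                       # any port other than the set's usual one

prepare_empty_ny1() {
    # 1. Shut down the running mongod cleanly (Linux-only option).
    mongod --shutdown --dbpath "$DBPATH" || return 1
    # 2. Back up the data files, just in case.
    cp -Rp "$DBPATH" "$BACKUP" || return 1
    # 3. Delete the data files. ${DBPATH:?} aborts if DBPATH is empty,
    #    so this can never expand to "rm -rf /*".
    rm -rf "${DBPATH:?}"/* || return 1
    # 4. Restart the now-empty server on a different port and without
    #    --replSet, so the rest of the set does not see it.
    mongod --dbpath "$DBPATH" --port "$PORT" --fork --logpath "$LOGPATH"
}
```

Defining the function does nothing on its own. Call prepare_empty_ny1 on ny1 only after the dump has finished and been checked, since step 3 is destructive.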
Finally, run mongorestore from ny2: