Repair is an offline operation. While it's running, the database will be locked against
reads and writes. The repair process works by reading and rewriting all data files, discarding
any corrupted documents in the process. It also rebuilds each index. This
means that to repair a database, you need enough free disk space to store the rewrite
of its data. To say repairs are expensive is an understatement, as repairing a very large
database can take days.
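The disk-space requirement can be checked before a repair is started. What follows is a minimal sketch rather than an official tool. It assumes you pass in the document returned by db.stats(), which reports dataSize and indexSize in bytes, along with a free-space figure you've gathered yourself (for example, from df). The function name hasRoomForRepair and the headroomBytes safety margin are hypothetical, introduced here only for illustration.

```javascript
// Hypothetical helper: estimates whether the rewritten copy produced by a
// repair would fit in the free disk space available.
// `stats` is assumed to be the result of db.stats().
// `freeBytes` must be supplied by you; the shell doesn't report it.
// `headroomBytes` is an assumed safety margin, not a documented MongoDB value.
function hasRoomForRepair(stats, freeBytes, headroomBytes) {
  var margin = headroomBytes || 0;
  // The rewrite needs room for the data plus the rebuilt indexes.
  var needed = stats.dataSize + stats.indexSize + margin;
  return freeBytes >= needed;
}
```

From the shell you might call it as hasRoomForRepair(db.stats(), freeBytes, margin), treating a false result as a reason to free up space or repair on a different volume.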
MongoDB's repair was originally used as a kind of last-ditch effort for recovering a
corrupted database. In the event of an unclean shutdown, without journaling
enabled, a repair is the only way to return the data files to a consistent state. Fortunately,
if you deploy with replication, run at least one server with journaling enabled,
and perform regular off-site backups, you should never have to recover by running a
repair. Relying on repair for recovery is foolish. Avoid it.
What then might a database repair be good for? Running a repair will compact the
data files and rebuild the indexes. As of the v2.0 release, MongoDB doesn't have great
support for data file compaction. So if you perform lots of random deletes, and especially
if you're deleting small documents (< 4 KB), it's possible for total storage size to
remain constant or grow despite these regularly occurring deletes. Compacting the
data files is a good remedy for this excess use of space.
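One rough way to judge whether a collection is a candidate for compaction is to compare the space its documents occupy with the space allocated to it. The sketch below assumes the output of db.spreadsheets.stats(), where size is the data size and storageSize is the allocated space, both in bytes. The function name unusedFraction is hypothetical, and the result is only an approximation: allocated space also includes preallocated room that isn't the product of deletes.

```javascript
// Hypothetical helper: fraction of a collection's allocated storage that is
// not currently occupied by documents.
// `collStats` is assumed to be the result of db.<collection>.stats().
function unusedFraction(collStats) {
  if (!collStats.storageSize) {
    return 0; // empty or missing collection: nothing to reclaim
  }
  var unused = collStats.storageSize - collStats.size;
  return unused > 0 ? unused / collStats.storageSize : 0;
}
```

A value that stays high after heavy deletes suggests that compaction is worth scheduling. Where to draw the line is a judgment call for your own deployment.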
If you don't have the time or resources to run a complete repair, there are two
options, both of which operate on a single collection. You can either rebuild indexes
or compact the collection. To rebuild indexes, use the reIndex() method:
> use cloud-docs
> db.spreadsheets.reIndex()
This might be useful, but generally speaking, index space is efficiently reused; the
data file space is what can be a problem. So the compact command is usually a better
choice. compact will rewrite the data files and rebuild all indexes for one collection.
Here's how you run it from the shell:
> db.runCommand({ compact: "spreadsheets" })
This command has been designed to be run on a live secondary, obviating the need for
downtime. Once you've finished compacting all the secondaries in a replica set, you
can step down the primary and then compact that node. If you must run the compact
command on the primary, you can do so by adding {force: true} to the command
object. Note that if you go this route, the command will write-lock the system:
> db.runCommand({ compact: "spreadsheets", force: true })
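The rolling procedure described above can be sketched as two small helpers. This is an illustration under stated assumptions, not a complete maintenance script: compactionOrder and compactCommand are hypothetical names, and the input is assumed to be the members array from rs.status(), whose entries carry name and stateStr fields. You'd still need to connect to each node in turn and, before handling the old primary, step it down with rs.stepDown().

```javascript
// Hypothetical helper: builds the compact command document for a collection.
// `force` is added only when the command must run on a primary.
function compactCommand(collection, onPrimary) {
  var cmd = { compact: collection }; // the command name must be the first key
  if (onPrimary) {
    cmd.force = true;
  }
  return cmd;
}

// Hypothetical helper: orders replica set members so that every secondary is
// compacted before the primary. Members in any other state (arbiters,
// recovering nodes) are skipped.
function compactionOrder(members) {
  var secondaries = [];
  var primaries = [];
  for (var i = 0; i < members.length; i++) {
    var m = members[i];
    if (m.stateStr === "SECONDARY") {
      secondaries.push(m.name);
    } else if (m.stateStr === "PRIMARY") {
      primaries.push(m.name);
    }
  }
  return secondaries.concat(primaries);
}
```

In the shell, compactionOrder(rs.status().members) gives the order in which to visit the nodes, and db.runCommand(compactCommand("spreadsheets", false)) issues the command on each secondary. If you step the primary down first, it becomes a secondary and the force option isn't needed.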
10.3.3 Upgrading
MongoDB is still a relatively young project, which means that new releases generally
contain lots of important bug fixes and performance improvements. For this reason,
you should try to run the latest stable version of the software when possible. Upgrading,
at least until v2.0, has been a simple process of shutting down the old mongod process
and starting the new one with the old data files. Subsequent versions of MongoDB