CHAPTER 5
Administration Tips
Tip #39: Manually clean up your chunks collections
GridFS keeps file contents in a collection of chunks, called fs.chunks by default. Each
document in the files collection points to one or more documents in the chunks collection.
It's good to check every once in a while to make sure that there are no “orphan” chunks:
chunks floating around with no link to a file. This could occur if the database was shut
down in the middle of saving a file (the fs.files document is written after the chunks).
To check over your chunks collection, choose a time when there's little traffic (as you'll
be loading a lot of data into memory) and run something like:
> var cursor = db.fs.chunks.find({}, {"_id" : 1, "files_id" : 1});
> while (cursor.hasNext()) {
...     var chunk = cursor.next();
...     if (db.fs.files.findOne({_id : chunk.files_id}) == null) {
...         print("orphaned chunk: " + chunk._id);
...     }
... }
This will print out the _ids for all orphaned chunks.
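Checking every chunk individually means one files lookup per chunk. As a lighter variant, a minimal sketch could look up each distinct files_id only once. The helper name findOrphanedFileIds is hypothetical, and the sketch assumes the list of distinct ids is small enough to come back as a single result:

```javascript
// Hypothetical helper, not part of MongoDB. `db` is the mongo shell
// database handle (or any object exposing the same collection methods).
function findOrphanedFileIds(db) {
    var orphaned = [];
    // One entry per file referenced by at least one chunk.
    var referenced = db.fs.chunks.distinct("files_id");
    referenced.forEach(function (filesId) {
        // findOne returns null when no fs.files document has this _id.
        if (db.fs.files.findOne({_id : filesId}) === null) {
            orphaned.push(filesId);
        }
    });
    return orphaned;
}
```

In the shell, calling findOrphanedFileIds(db) returns an array of files_id values whose chunks have no matching fs.files document. It reports files rather than individual chunk _ids.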
Now, before you go through and delete all of the orphaned chunks, make sure that
they are not parts of files that are currently being written! You should check
db.currentOp() and the fs.files collection for recent uploadDates.
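Both checks can be combined into one guard. The following is a rough sketch, not a definitive test: the helper name uploadLooksActive and the windowMs and now parameters are assumptions made for illustration, and the sketch assumes the default fs.chunks and fs.files collection names.

```javascript
// Hypothetical helper, not part of MongoDB. Returns true if GridFS
// looks busy, in which case deleting "orphans" should wait.
function uploadLooksActive(db, windowMs, now) {
    var prefix = db.getName() + ".fs.";
    // 1. Any in-progress operation touching the GridFS collections?
    var busy = db.currentOp().inprog.some(function (op) {
        return op.ns === prefix + "chunks" || op.ns === prefix + "files";
    });
    if (busy) {
        return true;
    }
    // 2. Any file whose upload finished within the last windowMs ms?
    var cutoff = new Date(now.getTime() - windowMs);
    return db.fs.files.findOne({uploadDate : {$gt : cutoff}}) !== null;
}
```

For example, uploadLooksActive(db, 10 * 60 * 1000, new Date()) treats any upload recorded in the last ten minutes as a sign of activity. The window is a judgment call: it should be comfortably longer than the slowest upload you expect.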
Tip #40: Compact databases with repair
In “Tip #31: Do not depend on repair to recover data” on page 35, we cover why you
usually shouldn't use repair to actually repair your data (unless you're in dire straits).
However, repair can be used to compact databases.
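As a minimal sketch of what that might look like from the shell, assuming a MongoDB version whose shell still provides db.repairDatabase(), the hypothetical helper below runs a repair and reports the storageSize value from db.stats() before and after, so the effect can be checked:

```javascript
// Hypothetical helper, not part of MongoDB. Assumes `db` exposes
// repairDatabase() and stats(), as the mongo shell of this era does.
function compactWithRepair(db) {
    var before = db.stats().storageSize;
    var result = db.repairDatabase();
    var after = db.stats().storageSize;
    return {
        ok : result.ok,
        storageSizeBefore : before,
        storageSizeAfter : after
    };
}
```

Like the orphan check above, this is best run at a quiet time, since a repair rewrites all of the database's data.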