Chapter 5
GridFS
We live in a world of high-definition video, 12MP cameras, and storage media that can hold 50GB of data on a disc
the size of a CD-ROM. In that context, the 16MB limit for the maximum size of a MongoDB document might seem
laughably inadequate. Indeed, you might wonder why MongoDB, which has been designed as a database for today's
high-tech age, has such a seemingly strange limitation. The short answer is performance.
If large files were stored in the document itself, the document would obviously get very large, which in turn would
make the data harder to work with. For example, pulling back the whole document would require loading the files in
the document, as well. You could work around this issue, but you would still need to pull back the entire file whenever
you accessed it, even if you only wanted a small section of it. You can't ask for a chunk of data in the middle of a
document; it's an all-or-nothing proposition. Fortunately, MongoDB features a unique and somewhat elegant solution to this problem.
MongoDB enables you to store large files quite easily, yet it also allows you to access parts of the file without retrieving
the entire thing—all while maintaining high performance. It achieves this by leveraging a specification known as
GridFS.
One interesting thing about GridFS is that it isn't actually a software feature. That is, there isn't any
special server-side code in MongoDB that manages GridFS. Instead, GridFS is a simple specification used by all of the
supported drivers for MongoDB. The key benefit of such a specification is that files stored by one driver can be accessed
by any other driver that follows the same convention.
Note
This approach adheres closely to the MongoDB principle of keeping things simple. Because GridFS uses standard
MongoDB features, it's easy to implement and work with the specification from the driver's point of view. It also
means you can poke around by hand if you really want to because, to MongoDB, files stored under the GridFS
specification are just normal collections containing documents.
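To make the idea of "just normal collections" more concrete, here is a minimal sketch in plain Python, with no MongoDB server involved. Two ordinary lists stand in for the two collections that GridFS drivers use by convention (one for file metadata, one for chunks), and the field names (length, chunkSize, files_id, n, data) follow the GridFS specification. The store and read_range helpers and the tiny 4-byte chunk size are hypothetical and chosen purely for illustration; real drivers use much larger chunks and have their own APIs.

```python
import math

CHUNK_SIZE = 4  # tiny on purpose, for illustration only; real drivers use far larger chunks


def store(files, chunks, file_id, filename, data, chunk_size=CHUNK_SIZE):
    """Hypothetical helper: split data into fixed-size chunks plus one metadata document."""
    files.append({"_id": file_id, "filename": filename,
                  "length": len(data), "chunkSize": chunk_size})
    for n in range(math.ceil(len(data) / chunk_size)):
        chunks.append({"files_id": file_id, "n": n,
                       "data": data[n * chunk_size:(n + 1) * chunk_size]})


def read_range(files, chunks, file_id, start, length):
    """Hypothetical helper: return bytes [start, start + length) and the chunk numbers touched."""
    meta = next(f for f in files if f["_id"] == file_id)
    size = meta["chunkSize"]
    end = min(start + length, meta["length"])
    if start >= end:
        return b"", []
    # Only the chunks that overlap the requested range are needed.
    first, last = start // size, (end - 1) // size
    needed = sorted((c for c in chunks
                     if c["files_id"] == file_id and first <= c["n"] <= last),
                    key=lambda c: c["n"])
    blob = b"".join(c["data"] for c in needed)
    offset = start - first * size  # position of `start` inside the first needed chunk
    return blob[offset:offset + (end - start)], [c["n"] for c in needed]


files, chunks = [], []  # stand-ins for the metadata and chunk collections
store(files, chunks, 1, "demo.bin", b"abcdefghijklmnopqrstuvwxyz")
part, touched = read_range(files, chunks, 1, 9, 5)
print(part, touched)
```

Reading five bytes from the middle of the 26-byte file touches only two of its seven chunks, which is the property that lets a driver hand back part of a large file without retrieving the entire thing.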
Filling in Some Background
Chapter 1 touched on the fact that we have been taught to use databases for even simple storage for many years. For
example, the book one of us bought to help improve his PHP more than 15 years ago introduced MySQL in Chapter 3.
Considering the complexity of SQL and databases in the real world (not to mention in theory), you might wonder why
a book intended for beginners would practically start off with SQL. After all, it was a PHP book and not a MySQL book.
One thing most people don't appreciate until they try it is that reading and writing data directly to disk is hard.
Some people don't agree with us on this point; after all, opening and reading files in PHP might seem trivial. And
it is: in simpler scenarios, working with files is rather painless when using PHP. If all you want to do is read in lines and
process them, you're unlikely to have any trouble.
 
 