> use images
> db.thumbnails.findOne({}, {data: 0})
{
"_id" : ObjectId("4d608614238d3b4ade000001"),
"md5" : BinData(5,"K1ud3EUjT49wdMdkOGjbDg=="),
"name" : "monument-thumb.jpg"
}
Note that the md5 field is clearly marked as binary data, with both the subtype (5, which denotes MD5) and the raw payload shown.
C.2 GridFS
GridFS is a convention for storing files of arbitrary size in MongoDB. The GridFS
specification is implemented by all of the official drivers and by MongoDB's mongofiles
tool, ensuring consistent access across platforms. GridFS is useful for storing large
binary objects in the database. It's frequently fast enough to serve these objects as well,
and the storage method is conducive to streaming.
The term GridFS frequently leads to confusion, so two clarifications are worth making
right off the bat. The first is that GridFS isn't an intrinsic feature of MongoDB. As
mentioned, it's a convention that all the official drivers (and some tools) use to manage
large binary objects in the database. Second, it's important to clarify that GridFS
doesn't have the rich semantics of bona fide file systems. For instance, there's no
protocol for locking and concurrency, and this limits the GridFS interface to simple put,
get, and delete operations. This means that if you want to update a file, you need to
delete it and then put the new version.
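The delete-then-put pattern can be sketched without a database at all. The following is a minimal illustration, assuming a hypothetical in-memory class named ToyFileStore that mimics only the put, get, and delete operations; it isn't part of any MongoDB driver, and the filename notes.txt is made up for the example.

```ruby
require 'securerandom'

# Hypothetical in-memory stand-in for a GridFS-style store.
# It exposes only put, get, and delete, with no update operation.
class ToyFileStore
  def initialize
    @files = {}
  end

  # Stores the data under a new id and returns that id.
  def put(data, filename)
    id = SecureRandom.hex(12) # stand-in for an ObjectId
    @files[id] = { filename: filename, data: data.dup }
    id
  end

  # Returns the stored data; raises KeyError if the id is unknown.
  def get(id)
    @files.fetch(id)[:data]
  end

  # Removes the file with the given id.
  def delete(id)
    @files.delete(id)
    nil
  end
end

store = ToyFileStore.new
old_id = store.put('version 1', 'notes.txt')

# "Updating" a file means deleting the old one and putting the new version.
store.delete(old_id)
new_id = store.put('version 2', 'notes.txt')

puts store.get(new_id)
```

Note that the replacement file gets a new id, so anything that stored the old id has to be updated as well.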
GridFS works by dividing a large file into small, 256 KB chunks and then storing
each chunk as a separate document. By default, these chunks are stored in a collection
called fs.chunks. Once the chunks are written, the file's metadata is stored in a
single document in another collection called fs.files. Figure C.1 contains a simplistic
illustration of this process applied to a theoretical 1 MB file called canyon.jpg.
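To make the chunking concrete, here's a rough sketch in plain Ruby that splits a 1 MB byte string into 256 KB pieces and builds hashes shaped like the documents in fs.chunks and fs.files. It doesn't talk to MongoDB. The helper names build_chunks and build_file_doc are made up for this example, the file id is a random hex string standing in for an ObjectId, and the documents a real driver writes carry additional fields such as an upload date.

```ruby
require 'digest'
require 'securerandom'

CHUNK_SIZE = 256 * 1024 # 256 KB, the chunk size described in the text

# Splits a binary string into hashes shaped like fs.chunks documents.
# Each chunk records the file it belongs to and its position (n).
def build_chunks(data, files_id, chunk_size = CHUNK_SIZE)
  chunks = []
  offset = 0
  n = 0
  while offset < data.bytesize
    chunks << {
      'files_id' => files_id,
      'n'        => n,
      'data'     => data.byteslice(offset, chunk_size)
    }
    offset += chunk_size
    n += 1
  end
  chunks
end

# Builds a single hash shaped like an fs.files metadata document.
def build_file_doc(data, files_id, filename, chunk_size = CHUNK_SIZE)
  {
    '_id'       => files_id,
    'filename'  => filename,
    'length'    => data.bytesize,
    'chunkSize' => chunk_size,
    'md5'       => Digest::MD5.hexdigest(data)
  }
end

data     = "\x00".b * (1024 * 1024) # stand-in for a 1 MB canyon.jpg
files_id = SecureRandom.hex(12)     # stand-in for an ObjectId

chunks   = build_chunks(data, files_id)
file_doc = build_file_doc(data, files_id, 'canyon.jpg')

puts "#{chunks.size} chunks, #{file_doc['length']} bytes"
# prints "4 chunks, 1048576 bytes"
```

Reassembling the file is the reverse: sort the chunks for a given files_id by n and concatenate their data.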
That should be enough theory to use GridFS. Next we'll see GridFS in practice
through the Ruby GridFS API and the mongofiles utility.
C.2.1 GridFS in Ruby
Earlier you stored a small image thumbnail. The thumbnail took up only 10 KB and
was thus ideal for keeping in a single document. The original image is almost 2 MB
in size, and is therefore much more appropriate for GridFS storage. Here you'll
store the original using Ruby's GridFS API. First, you connect to the database and
then initialize a Grid object, which takes a reference to the database where the
GridFS file will be stored.
Next, you open the original image file, canyon.jpg, for reading. The most basic
GridFS interface uses methods to put and get a file. Here you use the Grid#put
method, which takes either a string of binary data or an IO object, such as a file
pointer. You pass in the file pointer and the data is written to the database.
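As a rough illustration of what accepting "either a string of binary data or an IO object" can look like, the sketch below uses a hypothetical helper named read_payload. It isn't the driver's implementation of Grid#put, only one plausible way a put-style method could normalize both kinds of input into bytes; a StringIO stands in for a file pointer so that the example runs without canyon.jpg on disk.

```ruby
require 'stringio'

# Hypothetical helper: accepts a string of binary data or anything
# IO-like (responds to #read) and returns the payload as a binary string.
def read_payload(source)
  data = source.respond_to?(:read) ? source.read : source.to_s
  data.b # copy of the data tagged with binary (ASCII-8BIT) encoding
end

bytes = "\xFF\xD8\xFF\xE0".b # first bytes of a typical JPEG file

from_string = read_payload(bytes)
from_io     = read_payload(StringIO.new(bytes)) # stands in for a file pointer

puts from_string == from_io
```

A file opened with File.open(path, 'rb') responds to read in the same way, so it could be passed wherever the StringIO is used here.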