On the other hand, things become a lot harder if you want to search a file or store complicated or structured data.
Even if you can work out how to do this and create a solution, your solution is unlikely to be faster or more efficient
than relying on a database instead. Today's applications depend on finding and storing data quickly—and databases
make this possible for those of us who can't or don't want to write such a system ourselves.
One area that many books gloss over is the storing of files. Most books that teach you to use a database to
store your data also teach you to read and write to the filesystem instead when you need to store files. In some ways, this
isn't a problem, because it's much easier to read and write simple files than to process what's in them. There are
some issues, however. First, the developer must have permission to write those files in the first place, and that requires
giving the web server permission to write to the local filesystem. This might not seem likely to pose a problem, but it
gives system administrators nightmares—getting files onto a server is the first stage in being able to compromise it.
Databases can store binary files, but it's typically not elegant for them to do so. MySQL has a special column type
called BLOB. PostgreSQL requires special procedures to be followed to store such files, and the data isn't stored in
the table itself. In other words, it's messy. These solutions are obviously bolt-ons. Thus, it's not surprising that people
choose to write data to the disk instead. But that approach also has issues. Apart from the problems with security, it
adds another directory that needs to be backed up, and you must also ensure that this information is replicated to
all the appropriate servers. There are filesystems that provide the ability to write to disk and have that content fully
replicated (GFS is one example), but these solutions are complex, add overhead, and typically make your solution
harder to maintain.
MongoDB, on the other hand, enforces a maximum document size of 16 MB. This is more than enough for storing
rich documents, and it might have sufficed a few years ago for storing many other types of files as well. However, this
limit is wholly inadequate for today's environment.
Working with GridFS
Next, we'll take a brief look at how GridFS is implemented. As the MongoDB website points out, you do not need to
understand or be aware of the underlying implementation of GridFS to use it. In fact, you can simply let the driver
handle the heavy lifting for you. For the most part, the drivers that support GridFS implement file handling in a
language-specific way. For example, the MongoDB driver for Python works in a manner that is wholly consistent with
Python, as you'll see shortly. If the ins and outs of GridFS don't interest you, then just skip ahead to the next section.
We promise you won't miss anything that enables you to use MongoDB effectively!
GridFS consists of two parts. More specifically, it consists of two collections. One collection holds the filename
and related information such as size (called metadata), while the other collection holds the file data itself, usually
in 256 KB chunks (recent drivers default to 255 KB). The specification calls for these to be named files and chunks,
respectively. By default, the files and chunks collections are created in the fs namespace (as fs.files and fs.chunks),
but this can be changed. The ability to change the default
namespace is useful if you want to store different types of files. For example, you might want to keep image and movie
files separate.
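To make the two-collection layout concrete, here is a minimal sketch in plain Python that needs no MongoDB server. It is not the driver's implementation. The function names split_into_gridfs_docs and reassemble are hypothetical, and the documents are ordinary dictionaries that only mimic the field names in the GridFS specification (filename, length, chunkSize, files_id, n, data). The chunk size is assumed to be the 256K figure quoted above.

```python
# Assumption: the 256K chunk size quoted in the text; real drivers may default to another value.
DEFAULT_CHUNK_SIZE = 256 * 1024


def split_into_gridfs_docs(file_id, filename, data, chunk_size=DEFAULT_CHUNK_SIZE):
    """Return (files_doc, chunk_docs) mimicking the GridFS two-collection layout."""
    # One metadata document per file, as it would appear in the files collection.
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
    }
    # One document per slice of the file, as it would appear in the chunks collection.
    # files_id points back at the metadata document; n records the chunk's position.
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


def reassemble(chunk_docs):
    """Rebuild the original bytes by joining the chunks in order of n."""
    ordered = sorted(chunk_docs, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)
```

Under these assumptions, a 600 KB payload yields one files document and three chunk documents, the last of which is shorter than the others. In practice the driver does this splitting and reassembly for you.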
Getting Started with the Command-Line Tools
Now that we have some of the background out of the way, let's look at how to get started with GridFS by exploring the
command-line tools available to leverage it. First, we will need a file to play with. To keep things simple, let's use the
dictionary file. On Ubuntu, you can find this at /usr/share/dict/words. However, there are various levels of symbolic
links, so you might want to copy its contents to a regular file first by running this command:
root@core2:/usr/share/dict# cat words > /tmp/dictionary
Note
On Ubuntu, you might need to use apt-get install wbritish to get the dictionary file installed.
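If you would rather prepare the file from Python than from the shell, the same step can be sketched as follows. This is an illustration, not part of the MongoDB tooling. The helper name copy_resolved is hypothetical, and it relies only on the standard library calls os.path.realpath and shutil.copyfile.

```python
import os
import shutil


def copy_resolved(source, destination):
    """Follow any symbolic links from source and copy the real file's contents."""
    # realpath walks every level of symbolic link and returns the final target.
    real_path = os.path.realpath(source)
    # copyfile writes a regular file at destination containing the same bytes.
    shutil.copyfile(real_path, destination)
    return real_path
```

For example, copy_resolved("/usr/share/dict/words", "/tmp/dictionary") would leave a regular file at /tmp/dictionary, assuming the dictionary package is installed at that path.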