GridFS - MongoDB Basics

Filling in Some Background

Chapter 1 touched on the fact that we have been taught to use databases for even simple
storage for many years. For example, the book one of us bought to help improve his PHP
more than 15 years ago introduced MySQL in Chapter 3. Considering the complexity of
SQL and databases in the real world (not to mention in theory), you might wonder why
a book intended for beginners would practically start off with SQL. After all, it was a PHP
book and not a MySQL book.

One thing most people don't appreciate until they try it is that reading and writing
data directly to disk is hard. Some people don't agree with us on this point; after all,
opening and reading files in Python might seem trivial. And it is: in simpler scenarios,
working with files is rather painless in PHP, too. If all you want to do is read in lines
and process them, you're unlikely to have any trouble.
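To make the "simple scenario" concrete, here is a minimal Python sketch of reading lines
and processing them. The function name count_nonblank_lines and the sample file contents
are our own illustration, not something taken from the book.

```python
import os
import tempfile


def count_nonblank_lines(path):
    """Return the number of lines in a text file that contain non-whitespace."""
    count = 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:  # the file object yields one line at a time
            if line.strip():
                count += 1
    return count


# Write a small throwaway file so the sketch runs on its own (hypothetical data).
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("first\n\nsecond\nthird\n")
    sample_path = tmp.name

line_total = count_nonblank_lines(sample_path)
os.remove(sample_path)  # clean up the temporary file
print(line_total)
```

Nothing here needs an index, a query language, or a schema, which is exactly why this
kind of task rarely causes trouble.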

On the other hand, things become a lot harder if you want to search a file or store
complicated or structured data. Even if you can work out how to do this and create a
solution, your solution is unlikely to be faster or more efficient than relying on a database
instead. Today's applications depend on finding and storing data quickly, and databases
make this possible for those of us who can't or don't want to write such a system
ourselves.

One area that is glossed over by many books is the storing of files. Most books that
teach you to use a database to store your data also teach you to read and write to the
filesystem instead when you need to store files. In some ways, this isn't a problem,
because it's much easier to read and write simple files than to process what's in them.

There are some issues, however. First, the developer must have permission to write
those files in the first place, and that requires giving the web server permission to write
to the local filesystem. This might not seem likely to pose a problem, but it gives system
administrators nightmares: getting files onto a server is the first step toward being able
to compromise it.

Databases can store binary files; typically, it's just not elegant for them to do so.
MySQL has a special column type called BLOB. PostgreSQL requires special procedures
to be followed to store such files, and the data isn't stored in the table itself. In other
words, it's messy. These solutions are obviously bolt-ons. Thus, it's not surprising that
people choose to write data to the disk instead. But that approach also has issues. Apart
from the problems with security, it adds another directory that needs to be backed up,
and you must also ensure that this information is replicated to all the appropriate servers.
There are filesystems that provide the ability to write to disk and have that content fully
replicated (including GFS), but these solutions are complex and add overhead. Moreover,
these features typically make your solution harder to maintain.
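To show what storing a binary file in a BLOB column looks like, here is a minimal sketch.
It uses Python's built-in sqlite3 module with an in-memory database as a stand-in, because
that runs without a server. SQLite is not MySQL, and the table name, column names, and
payload below are assumptions made purely for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a relational database with a BLOB type.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT PRIMARY KEY, data BLOB)")

# Hypothetical binary content; in practice this would come from a file opened in "rb" mode.
payload = bytes(range(256))

# Parameter binding passes the bytes through unchanged, with no manual escaping.
conn.execute(
    "INSERT INTO files (name, data) VALUES (?, ?)", ("sample.bin", payload)
)
conn.commit()

# Read the file back by name and confirm it round-trips byte for byte.
row = conn.execute(
    "SELECT data FROM files WHERE name = ?", ("sample.bin",)
).fetchone()
stored = bytes(row[0])
conn.close()

print(stored == payload)
```

The round trip works, but the file contents are now one opaque value in a row: the whole
blob is written and read as a unit in this sketch, which hints at why the approach feels
bolted on.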

MongoDB, on the other hand, enforces a maximum document size of 16 MB. This is
more than enough for storing rich documents, and it might have sufficed a few years ago
for storing many other types of files as well. However, this limit is wholly inadequate for
today's environment.
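A quick bit of arithmetic shows how little room that limit leaves. The sketch below is
only an illustration: the constant and function names are our own, the limit is taken to
be 16 * 1024 * 1024 bytes, and the file sizes are hypothetical. The check is also
optimistic, because a real document needs some space for field names and other overhead.

```python
# The 16 MB limit discussed above, taken here as 16 * 1024 * 1024 bytes (an assumption).
MAX_DOCUMENT_BYTES = 16 * 1024 * 1024


def fits_in_one_document(payload_size):
    """Return True if a payload of this many bytes is below the document limit.

    This is an optimistic upper bound: it ignores the space that field names
    and other document overhead would take up.
    """
    return payload_size < MAX_DOCUMENT_BYTES


# Hypothetical file sizes, in bytes.
photo_size = 2 * 1024 * 1024    # a 2 MB photo
video_size = 700 * 1024 * 1024  # a 700 MB video

photo_fits = fits_in_one_document(photo_size)
video_fits = fits_in_one_document(video_size)
print(photo_fits, video_fits)
```

A typical photo fits comfortably, while the video in this example is more than 40 times
too large to store in a single document.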
