database to do an insert operation on a certain collection. By using batch insert, the
database doesn't need to reprocess this information for each document.
Batch inserts are intended for use in applications, for example to insert a couple of
hundred sensor data points into an analytics collection at once. They are useful only if
you are inserting multiple documents into a single collection: you cannot use batch
inserts to insert into multiple collections with a single request. If you are just importing
raw data (for example, from a data feed or MySQL), there are command-line tools like
mongoimport that can be used instead of batch insert. On the other hand, it is often
handy to munge data before saving it to MongoDB (converting dates to the date type
or adding a custom "_id"), so batch inserts can be used for importing data as well.
Current versions of MongoDB do not accept messages longer than 16MB, so there is a
limit to how much can be inserted in a single batch insert.
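As a rough illustration of the munging and size limit described above, the following sketch prepares a few documents and groups them into batches that stay under a byte budget. The sample readings, the field names, and the helpers prepareReading, approxBytes, and splitIntoBatches are hypothetical, and the size estimate uses the length of the JSON text as a stand-in for the real BSON size, so treat it as an approximation rather than what a driver actually does.

```javascript
// Hypothetical raw readings, as they might arrive from a data feed.
const rawReadings = [
  { sensor: "s1", at: "2010-03-01T12:00:00Z", value: 21.5 },
  { sensor: "s2", at: "2010-03-01T12:00:05Z", value: 19.0 },
  { sensor: "s1", at: "2010-03-01T12:01:00Z", value: 21.7 },
];

// Munge one reading: convert the timestamp string to a Date and
// add a custom "_id" built from the sensor name and the timestamp.
function prepareReading(raw) {
  const at = new Date(raw.at);
  return {
    _id: `${raw.sensor}:${at.getTime()}`,
    sensor: raw.sensor,
    at,
    value: raw.value,
  };
}

// Rough size estimate. ASSUMPTION: the UTF-8 length of the JSON text
// stands in for the BSON size; it is not the exact encoded size.
function approxBytes(doc) {
  return new TextEncoder().encode(JSON.stringify(doc)).length;
}

// Group documents into batches whose estimated total stays within maxBytes.
// A document that is too large on its own is rejected instead of sent.
function splitIntoBatches(docs, maxBytes) {
  const batches = [];
  let current = [];
  let currentBytes = 0;
  for (const doc of docs) {
    const size = approxBytes(doc);
    if (size > maxBytes) {
      throw new Error(`document ${doc._id} is larger than the batch limit`);
    }
    // Start a new batch when adding this document would exceed the budget.
    if (current.length > 0 && currentBytes + size > maxBytes) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(doc);
    currentBytes += size;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

const prepared = rawReadings.map(prepareReading);
const batches = splitIntoBatches(prepared, 16 * 1024 * 1024);
// Each element of `batches` is one array to hand to the driver's batch insert.
```

Each array in batches would then be passed to whatever batch insert call your driver provides. Leaving some headroom below the real limit is sensible, since the estimate ignores message overhead.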
Inserts: Internals and Implications
When you perform an insert, the driver you are using converts the data structure into
BSON, which it then sends to the database (see Appendix C for more on BSON). The
database understands BSON and checks for an "_id" key and that the document's size
does not exceed 4MB, but other than that, it doesn't do data validation; it just saves
the document to the database as is. This has a couple of side effects, most notably that
you can insert invalid data and that your database is fairly secure from injection attacks.
All of the drivers for major languages (and most of the minor ones, too) check for a
variety of invalid data (documents that are too large, contain non-UTF-8 strings, or use
unrecognized types) before sending anything to the database. If you are running a driver
that you are not sure about, you can start the database server with the --objcheck
option, and it will examine each document's structural validity before inserting it (at
the cost of slower performance).
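To make the kind of client-side checking described above more concrete, here is a minimal sketch of a pre-insert check. It is not the code of any real driver: the function names hasLoneSurrogate, findProblems, and checkDocument are hypothetical, the set of rejected types is only illustrative, and the size test again approximates the BSON size with the length of the JSON text.

```javascript
// True if the string contains an unpaired surrogate, which cannot be
// encoded as valid UTF-8.
function hasLoneSurrogate(text) {
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      const next = text.charCodeAt(i + 1); // NaN at the end of the string
      if (!(next >= 0xdc00 && next <= 0xdfff)) return true;
      i++; // skip the low half of a valid pair
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      return true;
    }
  }
  return false;
}

// Walk a value recursively and collect descriptions of anything suspicious.
// ASSUMPTION: the rejected types below are illustrative, not the exact set
// that any particular driver refuses.
function findProblems(value, path = "") {
  const problems = [];
  const where = path || "(root)";
  const kind = typeof value;
  if (kind === "string") {
    if (hasLoneSurrogate(value)) {
      problems.push(`${where}: string cannot be encoded as UTF-8`);
    }
  } else if (kind === "function" || kind === "symbol" || kind === "undefined") {
    problems.push(`${where}: unrecognized type ${kind}`);
  } else if (Array.isArray(value)) {
    value.forEach((item, i) => {
      problems.push(...findProblems(item, `${path}[${i}]`));
    });
  } else if (value !== null && kind === "object" && !(value instanceof Date)) {
    for (const [key, item] of Object.entries(value)) {
      problems.push(...findProblems(item, path ? `${path}.${key}` : key));
    }
  }
  return problems;
}

// Check one document: field-level problems plus an approximate size test.
function checkDocument(doc, maxBytes) {
  const problems = findProblems(doc);
  const approxSize = new TextEncoder().encode(JSON.stringify(doc)).length;
  if (approxSize > maxBytes) {
    problems.push(`document is about ${approxSize} bytes, over the ${maxBytes}-byte limit`);
  }
  return problems;
}

const limit = 4 * 1024 * 1024; // the document size limit quoted in the text
const good = { _id: 1, name: "sensor", at: new Date(0), tags: ["a", "b"] };
const bad = { _id: 2, callback: () => 1, label: "broken \ud800 text" };

const goodProblems = checkDocument(good, limit);
const badProblems = checkDocument(bad, limit);
```

Here goodProblems comes back empty, while badProblems reports the function value and the malformed string. Returning a list of problems instead of throwing on the first one lets the caller log everything that is wrong with a document in one pass.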
Documents larger than 4MB (when converted to BSON) cannot be
saved to the database. This is a somewhat arbitrary limit (and may be
raised in the future); it is mostly to prevent bad schema design and ensure
consistent performance. To see the BSON size (in bytes) of the
document doc, run Object.bsonsize(doc) from the shell.
To give you an idea of how much 4MB is, the entire text of War and
Peace is just 3.14MB.
MongoDB does not do any sort of code execution on inserts, so they are not vulnerable
to injection attacks. Traditional injection attacks are impossible with MongoDB, and
alternative injection-type attacks are easy to guard against in general, but inserts are
particularly invulnerable.