Bulk inserts
If possible, you should use bulk inserts to insert event data. All write concern options apply to bulk inserts, but you can pass multiple events to the insert() method at once. Batch inserts allow MongoDB to distribute the performance penalty incurred by a more stringent write concern across a group of inserts.
If you're doing a bulk insert and do get an error (either a network interruption or a unique key violation), your application will need to handle the possibility of a partial bulk insert. If your particular use case doesn't care about missing a few inserts, you can add the continue_on_error=True argument to insert(), in which case the insert will insert as many documents as possible and report an error on the last insert that failed.
If you use continue_on_error=True and multiple inserts in your batch fail, your application will only receive information on the last insert to fail. The take-away? You can sometimes amortize the overhead of safer writes by using bulk inserts, but this technique brings with it another set of concerns as well.
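The partial-insert behavior described above can be sketched without a live MongoDB server. The following pure-Python simulation is illustrative only — the collection (a plain list), the unique key on _id, and the helper name are hypothetical stand-ins, mimicking how a bulk insert with continue_on_error=True keeps going after a duplicate-key failure and surfaces only the last error:

```python
# Illustrative simulation of continue_on_error bulk-insert semantics.
# NOTE: not real PyMongo -- a real driver call would resemble
# db.events.insert(docs, continue_on_error=True).

def bulk_insert_continue_on_error(collection, docs, unique_key='_id'):
    """Insert as many docs as possible; remember only the LAST error,
    mirroring how continue_on_error=True reports failures."""
    last_error = None
    seen = {doc[unique_key] for doc in collection}
    for doc in docs:
        if doc[unique_key] in seen:
            # Duplicate-key violation: skip this doc, but keep going.
            last_error = 'duplicate key: %r' % doc[unique_key]
            continue
        seen.add(doc[unique_key])
        collection.append(doc)
    return last_error

events = [{'_id': 1, 'path': '/index.html'}]
new_docs = [
    {'_id': 1, 'path': '/index.html'},     # fails: duplicate _id
    {'_id': 2, 'path': '/apache_pb.gif'},  # inserted
    {'_id': 1, 'path': '/other.html'},     # fails: duplicate _id (last error)
    {'_id': 3, 'path': '/apache_pb.gif'},  # inserted
]
err = bulk_insert_continue_on_error(events, new_docs)
print(len(events))  # 3 -- a partial bulk insert
print(err)          # only the last failure is reported
```

Note that the application sees a single error even though two inserts failed — exactly the ambiguity your error-handling code must account for.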
Finding all events for a particular page
The value in maintaining a collection of event data derives from being able to query that data to answer specific questions. You may have a number of simple queries that you use to analyze this data.
As an example, you may want to return all of the events associated with a specific value of a field. Extending the Apache access log example, a common case would be to query for all events with a specific value in the path field. This section contains a pattern for returning data and optimizing this operation.
In this case, you'd use a query that resembles the following to return all documents with the /apache_pb.gif value in the path field:
>>> q_events = db.events.find({'path': '/apache_pb.gif'})
Of course, if you want this query to perform well, you'll need to add an index on the path field:
>>> db.events.ensure_index('path')