NOTE
This case assumes that you're using a standard uncapped collection for this event data, unless
otherwise noted. See Capped collections for another approach to aging out old data.
Schema Design
The schema for storing log data in MongoDB depends on the format of the event data that
you're storing. For a simple example, you might consider standard request logs in the combined
format from the Apache HTTP Server. A line from these logs may resemble the following:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" ...
The simplest approach to storing the log data would be putting the exact text of the log record
into a document:
{
    _id: ObjectId(...),
    line: '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif ...
}
Although this solution does capture all data in a format that MongoDB can use, the data is
neither particularly useful nor efficient. For example, if you need to find events on the same
page, you would need to use a regular expression query, which would require a full scan of
the collection. A better approach is to extract the relevant information from the log data into
individual fields in a MongoDB document.
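As a rough illustration of that extraction step, the following Python sketch splits the sample line into named fields with a regular expression. It covers only the leading fields visible in the sample line above (the rest of the line is elided there and is ignored here), and the field names host, ident, user, time, method, path, and protocol are assumptions chosen for this example, not a schema prescribed by the text.

```python
import re

# Matches only the leading fields visible in the sample line; anything after
# the quoted request is elided in the text and is deliberately ignored here.
# The group names are hypothetical field names chosen for this sketch.
LOG_PREFIX = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)"'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PREFIX.match(line)
    return match.groupdict() if match else None


sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" ...')
event = parse_line(sample)
print(event['path'])  # /apache_pb.gif
```

With the path held in its own field, finding events for the same page becomes an equality match on that field instead of a regular expression over the whole line.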
When designing the structure of that document, it's important to pay attention to the data types
available for use in BSON, the MongoDB document format. Choosing your data types wisely
can have a significant impact on the performance and capability of the logging system. For
instance, consider the date field. In the previous example, [10/Oct/2000:13:55:36 -0700]
is 28 bytes long. If you store this with the UTC timestamp BSON type, you can convey the
same information in only 8 bytes.
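The size difference can be checked with a short Python sketch. It uses only the standard library and mimics the BSON UTC datetime encoding by hand (a signed 64-bit count of milliseconds since the Unix epoch); it is an illustration under that assumption, not how a MongoDB driver is invoked. The variable names are hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

raw = '[10/Oct/2000:13:55:36 -0700]'

# Parse the bracketed Apache timestamp, including its UTC offset,
# then normalize it to UTC.
parsed = datetime.strptime(raw, '[%d/%b/%Y:%H:%M:%S %z]')
utc = parsed.astimezone(timezone.utc)

# Hand-rolled stand-in for the BSON UTC datetime payload:
# a little-endian signed 64-bit integer of milliseconds since the epoch.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
millis = (utc - epoch) // timedelta(milliseconds=1)
packed = struct.pack('<q', millis)

print(len(raw.encode('ascii')))  # 28
print(len(packed))               # 8
```

The parsed value is also a real point in time rather than text, so two such values compare chronologically with ordinary comparison operators.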
Additionally, using proper types for your data increases query flexibility. If you store the
date as a timestamp, you can make date range queries, whereas it's very difficult to compare