Database Reference
In-Depth Information
One document per page per day, flat documents
Consider the following example schema for a solution that stores all statistics for a single day
and page in a single document:
{
_id
:
"20101010/site-1/apache_pb.gif"
,
metadata
:
{
date
:
ISODate
(
"2000-10-10T00:00:00Z"
),
site
:
"site-1"
,
page
:
"/apache_pb.gif"
},
daily
:
5468426
,
hourly
:
{
"0"
:
227850
,
"1"
:
210231
,
...
"23"
:
20457
},
minute
:
{
"0"
:
3612
,
"1"
:
3241
,
...
"1439"
:
2819
}
}
This approach has a couple of advantages:
▪ For every request on the website, you only need to update one document.
▪ Reports for time periods within the day, for a single page, require fetching a single docu-
ment.
If we use this schema, our real-time analytics system might record a hit with the following
code:
def
def
record_hit
(
collection
,
id
,
metadata
,
hour
,
minute
):
collection
.
update
(
{
'_id'
:
id
,
'metadata'
:
metadata
},
{
'$inc'
: {
'daily'
:
1
,
'hourly.
%d
'
%
hour
:
1
,
'minute.
%d
'
%
minute
:
1
} },
upsert
=
True
)