    monthly_zeros = [
        ('daily.%d' % d, 0)
        for d in range(1, 32)]

    # Perform upserts, setting metadata
    db.stats.daily.update(
        {
            '_id': id_daily,
            'metadata': daily_metadata
        },
        {'$inc': dict(daily_zeros)},
        upsert=True)
    db.stats.monthly.update(
        {
            '_id': id_monthly,
            'daily': daily},
        {'$inc': dict(monthly_zeros)},
        upsert=True)
This function pre-allocates both the monthly and daily documents at the same time. The
performance benefits from separating these operations are negligible, so it's reasonable to keep
both operations in the same function.
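
To make the zero-increment trick concrete, the following is a minimal in-memory sketch of what a $inc over dotted field paths does to a document. The apply_inc helper is hypothetical and only imitates the behavior for illustration; it is not part of PyMongo, and a real upsert would also seed the new document from the query's equality fields.

```python
# Simplified, in-memory stand-in for MongoDB's $inc on dotted field paths.
# apply_inc is a hypothetical helper for illustration, not a PyMongo function.

def apply_inc(doc, inc_spec):
    """Apply {'a.b': n} style increments to a nested dict, creating missing fields."""
    for path, amount in inc_spec.items():
        *parents, leaf = path.split('.')
        node = doc
        for key in parents:
            # Create intermediate sub-documents as needed
            node = node.setdefault(key, {})
        node[leaf] = node.get(leaf, 0) + amount
    return doc

# Same construction as monthly_zeros in the function above
monthly_zeros = [('daily.%d' % d, 0) for d in range(1, 32)]

# Case 1: no document yet (simplified here to an empty dict)
fresh = apply_inc({}, dict(monthly_zeros))

# Case 2: the document already holds counts; adding zero changes nothing
existing = {'daily': {'1': 42, '2': 7}}
updated = apply_inc(existing, dict(monthly_zeros))

print(len(fresh['daily']))     # 31 day slots created
print(updated['daily']['1'])   # 42, the existing count is preserved
```

Because every increment is zero, running the pre-allocation against a document that already contains counts fills in any missing slots and leaves the existing values untouched.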
The question now arises as to when to pre-allocate the documents. Obviously, for best
performance, they need to be pre-allocated before they are used (although the upsert code will
actually work correctly even if it executes against a document that already exists). While we
could pre-allocate the documents all at once, this leads to poor performance while the
pre-allocation runs. A better solution is to pre-allocate the documents probabilistically each
time we log a hit:
import random
from datetime import datetime, timedelta, time

# Example probability based on 500k hits per day per page
prob_preallocate = 1.0 / 500000

def log_hit(db, dt_utc, site, page):
    # On a small random fraction of hits, pre-allocate tomorrow's documents
    if random.random() < prob_preallocate:
        preallocate(db, dt_utc + timedelta(days=1), site, page)
    # Update daily stats doc
    ...
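
As a rough check on the chosen probability, the arithmetic below estimates how often a page pre-allocates tomorrow's documents in a day. It is a sketch that assumes hits are independent and that the page receives exactly the 500k hits per day used in the example; the variable names other than prob_preallocate are illustrative.

```python
import math

hits_per_day = 500000                   # figure taken from the example comment
prob_preallocate = 1.0 / hits_per_day

# Probability that at least one of the day's hits triggers pre-allocation
p_at_least_once = 1.0 - (1.0 - prob_preallocate) ** hits_per_day

# Expected number of preallocate() calls per day for this page
expected_calls = hits_per_day * prob_preallocate

print(round(p_at_least_once, 3))        # close to 1 - 1/e
print(expected_calls)
```

Under these assumptions a page triggers about one pre-allocation per day on average, and the chance of at least one is roughly 63 percent. A larger probability raises that chance at the cost of more redundant calls, which are harmless because the upserts only increment by zero.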