Writing programs using MongoDB - MongoDB in Action


Every tweet will have an id field (distinct from MongoDB's _id field) which references the tweet's internal Twitter ID. You're creating a unique index on this field to keep from inserting the same tweet twice.

You're also creating a compound index on tags ascending and id descending. Indexes can be specified in ascending or descending order. This matters mainly when creating compound indexes; you should always choose the directions based on your expected query patterns. Since you're going to want to query for a particular tag and show the results from newest to oldest, an index with tags ascending and id descending will make that query use the index both for filtering results and for sorting them. As you can see here, you indicate index direction with 1 for ascending and -1 for descending.
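To make the effect of index direction concrete, the following is a minimal pure-Ruby sketch, with no database involved, of the ordering that a compound index on tags ascending and id descending implies. The sort_by_spec helper and the sample documents are hypothetical and exist only for illustration; the spec array simply reuses the 1 and -1 convention described above, and each sample document carries a single tag string rather than an array to keep the comparison simple.

```ruby
# Hypothetical helper: order documents the way a compound index with the
# given spec would order its entries. Each spec entry is [field, direction],
# where direction is 1 (ascending) or -1 (descending).
def sort_by_spec(docs, spec)
  docs.sort do |a, b|
    result = 0
    spec.each do |field, direction|
      result = (a[field] <=> b[field]) * direction
      break unless result.zero? # first differing field decides the order
    end
    result
  end
end

# Made-up sample documents; a larger id is assumed to mean a newer tweet.
docs = [
  { "tags" => "mongodb", "id" => 101 },
  { "tags" => "ruby",    "id" => 205 },
  { "tags" => "mongodb", "id" => 300 },
  { "tags" => "ruby",    "id" => 150 }
]

spec = [["tags", 1], ["id", -1]]
ordered = sort_by_spec(docs, spec)

# Entries for one tag sit next to each other and already run from the
# highest id to the lowest, so no separate sort step is needed.
mongodb_ids = ordered.select { |d| d["tags"] == "mongodb" }.map { |d| d["id"] }
puts mongodb_ids.inspect
```

Because all entries for a given tag are contiguous and already in descending id order, a query for that tag sorted newest to oldest can walk them in sequence, which is the behavior the paragraph above attributes to the index.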

3.3.2 Gathering data

MongoDB allows you to insert data regardless of its structure. Since you don't need to know which fields you'll be given in advance, Twitter is free to modify its API's return values with practically no consequences to your application. With an RDBMS, any change to Twitter's API (or, more generally, to your data source) would normally require a database schema migration. With MongoDB, your application might need to change to accommodate new data schemas, but the database itself can handle any document-style schema automatically.
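As a rough illustration of why the application, rather than the database, absorbs a change in the data source, here is a minimal pure-Ruby sketch. The two sample hashes and the summarize helper are hypothetical: they stand in for tweet documents received before and after an imagined API change, and the text and lang field names are made up for the example rather than taken from Twitter's API.

```ruby
# Two made-up documents with different shapes; the second has an extra field.
older_doc = { "id" => 1, "text" => "hello", "tags" => ["mongodb"] }
newer_doc = { "id" => 2, "text" => "hi", "tags" => ["mongodb"], "lang" => "en" }

# Hypothetical reader: tolerates a missing field by supplying a default.
def summarize(doc)
  lang = doc.fetch("lang", "unknown")
  "#{doc['id']}: #{doc['text']} (#{lang})"
end

summaries = [older_doc, newer_doc].map { |doc| summarize(doc) }
puts summaries
```

Both hashes could be stored in the same collection unchanged; only code that reads the new field, such as summarize here, has to decide what to do when it is absent.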

The Ruby Twitter library returns Ruby hashes, so you can pass these directly to your MongoDB collection object. Within your TweetArchiver, you add the following instance method:

def save_tweets_for(term)
  Twitter::Search.new.containing(term).each do |tweet|
    @tweets_found += 1
    tweet_with_tag = tweet.to_hash.merge!({"tags" => [term]})
    @tweets.save(tweet_with_tag)
  end
end

Before saving each tweet document, you make one small modification. To simplify later queries, you add the search term to a tags attribute. Then you pass the modified document to the save method. Here, then, is the complete listing for the archiver class.

Listing 3.1 A class for fetching tweets and archiving them in MongoDB

require 'rubygems'
require 'mongo'
require 'twitter'

require 'config'

class TweetArchiver

  # Create a new instance of TweetArchiver
  def initialize(tag)
    connection = Mongo::Connection.new
    db         = connection[DATABASE_NAME]
    @tweets    = db[COLLECTION_NAME]
