Every tweet will have an id field (distinct from MongoDB's _id field) which references
the tweet's internal Twitter ID. You're creating a unique index on this field to keep
from inserting the same tweet twice.
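As a rough illustration of what the unique index buys you, here is a toy in-memory model in plain Ruby. It isn't MongoDB and doesn't use the driver; the class name UniqueById and its insert method are hypothetical, made up for this sketch. MongoDB enforces uniqueness on the server, and how a rejected insert is reported depends on the driver and its write settings.

```ruby
require 'set'

# Toy stand-in for a collection with a unique index on "id".
# Hypothetical class for illustration only; not part of any MongoDB driver.
class UniqueById
  attr_reader :docs

  def initialize
    @docs = []
    @seen_ids = Set.new
  end

  # Stores the document and returns true, or returns false when a
  # document with the same "id" has already been stored.
  def insert(doc)
    # Set#add? returns nil when the element is already present.
    return false unless @seen_ids.add?(doc["id"])
    @docs << doc
    true
  end
end

archive = UniqueById.new
archive.insert({ "id" => 1, "text" => "first copy" })
archive.insert({ "id" => 1, "text" => "same tweet again" })
puts archive.docs.size
```

The second insert is turned away because a document with that id is already present, so the archive ends up holding a single copy of the tweet.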
You're also creating a compound index on tags ascending and id descending.
Indexes can be specified in ascending or descending order. This matters mainly when
creating compound indexes; you should always choose the directions based on your
expected query patterns. Since you're going to want to query for a particular tag and
show the results from newest to oldest, an index with tags ascending and ID descending
will make that query use the index both for filtering results and for sorting them. As you
can see here, you indicate index direction with 1 for ascending and -1 for descending.
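To see why the directions matter, it can help to picture the order in which such a compound index keeps its entries. The following plain-Ruby sketch only imitates that ordering with sort_by; the sample entries are hypothetical, and it assumes, as the text does, that a larger ID means a newer tweet.

```ruby
# Hypothetical (tag, id) pairs standing in for index entries.
entries = [
  { "tags" => "ruby",    "id" => 101 },
  { "tags" => "mongodb", "id" => 205 },
  { "tags" => "ruby",    "id" => 307 },
  { "tags" => "mongodb", "id" => 102 }
]

# tags ascending (1), id descending (-1): negating the id reverses its order.
ordered = entries.sort_by { |e| [e["tags"], -e["id"]] }

# All entries for one tag sit next to each other, already newest first,
# so filtering by tag needs no separate sort step.
ruby_newest_first = ordered.select { |e| e["tags"] == "ruby" }.map { |e| e["id"] }

puts ruby_newest_first.inspect
```

Because the entries for a given tag are contiguous and already in descending ID order, a query for that tag sorted newest to oldest can read them straight off the index.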
3.3.2 Gathering data
MongoDB allows you to insert data regardless of its structure. Since you don't need to
know in advance which fields you'll be given, Twitter is free to modify its API's return
values with practically no consequences to your application. With an RDBMS, any
change to Twitter's API (or more generally, to your data source) would normally
require a database schema migration. With MongoDB, your application might need to
change to accommodate new data schemas, but the database itself can handle any
document-style schema automatically.
The Ruby Twitter library returns Ruby hashes, so you can pass these directly to
your MongoDB collection object. Within your TweetArchiver, you add the following
instance method:
  def save_tweets_for(term)
    Twitter::Search.new.containing(term).each do |tweet|
      @tweets_found += 1
      tweet_with_tag = tweet.to_hash.merge!({"tags" => [term]})
      @tweets.save(tweet_with_tag)
    end
  end
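The tagging step inside this method can be tried in isolation on a plain hash, with no Twitter or MongoDB connection. In this minimal sketch the sample tweet and the helper name tag_tweet are hypothetical; it uses the non-destructive Hash#merge, whereas the method above calls merge! on the hash returned by to_hash.

```ruby
# Hypothetical helper: returns a copy of the tweet hash with the
# search term recorded in a one-element "tags" array.
def tag_tweet(tweet, term)
  tweet.merge("tags" => [term])
end

# Hypothetical tweet data, standing in for what the Twitter library returns.
tweet  = { "id" => 1, "text" => "hello" }
tagged = tag_tweet(tweet, "mongodb")

puts tagged.inspect
```

Storing the term in an array leaves room for a document to carry more than one tag.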
Before saving each tweet document, you make one small modification. To simplify later
queries, you add the search term to a tags attribute. Then you pass the modified
document to the save method. Here, then, is the complete listing for the archiver class.
Listing 3.1 A class for fetching tweets and archiving them in MongoDB
require 'rubygems'
require 'mongo'
require 'twitter'
require 'config'

class TweetArchiver
  # Create a new instance of TweetArchiver
  def initialize(tag)
    connection = Mongo::Connection.new
    db         = connection[DATABASE_NAME]
    @tweets    = db[COLLECTION_NAME]
 