• Indexes can only use fields in the order in which they were defined. Say, for example,
we create the index { "timestamp": 1, "retweet_count": 1, "keywords": 1 }.
This index can serve queries structured in the following orders:
- timestamp, retweet_count, keywords
- timestamp
- timestamp, retweet_count
This index cannot serve queries structured in the following orders:
- retweet_count, timestamp, keywords
- keywords
- timestamp, keywords
• Indexes can contain at most one array field. Twitter provides Tweet metadata in
the form of arrays, but we can only use one of them in any given index.
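The ordering rule in the first bullet can be sketched in a few lines of Python. This is a simplified model of the rule as stated here, not MongoDB's query planner. The helper name uses_index_prefix is hypothetical, and INDEX_SPEC is written in the list-of-pairs form that the pymongo driver's create_index accepts (using pymongo at all is an assumption).

```python
# Compound index from the text, as (field, direction) pairs.
# With pymongo this could be passed to collection.create_index(INDEX_SPEC).
INDEX_SPEC = [("timestamp", 1), ("retweet_count", 1), ("keywords", 1)]
INDEX_FIELDS = [name for name, _direction in INDEX_SPEC]


def uses_index_prefix(query_fields, index_fields=INDEX_FIELDS):
    """Return True if query_fields, in order, are a leading prefix of the index.

    Hypothetical helper: it only models the ordering rule described above.
    """
    n = len(query_fields)
    return 0 < n <= len(index_fields) and list(query_fields) == index_fields[:n]


if __name__ == "__main__":
    candidates = [
        ["timestamp", "retweet_count", "keywords"],
        ["timestamp"],
        ["timestamp", "retweet_count"],
        ["retweet_count", "timestamp", "keywords"],
        ["keywords"],
        ["timestamp", "keywords"],
    ]
    for fields in candidates:
        print(fields, uses_index_prefix(fields))
```

The first three candidates are leading prefixes of the index and the last three are not, which matches the two lists above.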
3.8 Extracting Documents: Retrieving All Documents in a Collection
The simplest query we can give MongoDB is one that returns all of the data in a
collection. We use MongoDB's find function to do this, an example of which is
shown in Listing 3.3.
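The listing itself is not reproduced in this section. As a rough sketch of the same idea, assuming the Python driver pymongo (the listing may well use another language), an empty filter document passed to find matches every document. The function name fetch_all and the database and collection names in the comments are hypothetical.

```python
def fetch_all(collection):
    """Return every document in the collection as a list.

    `collection` is expected to behave like a pymongo Collection:
    find({}) with an empty filter yields all documents.
    """
    return list(collection.find({}))


# Possible usage against a running server (names are placeholders):
#   from pymongo import MongoClient
#   tweets = MongoClient()["twitter"]["tweets"]
#   for doc in fetch_all(tweets):
#       print(doc)
```

Materializing the whole cursor into a list is convenient for small collections; for a large Tweet dataset it is usually better to iterate over the cursor directly.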
3.9 Filtering Documents: Number of Tweets Generated in a Certain Hour
Suppose we want to know the number of Tweets in our dataset from a particular
hour. To do this we will have to filter our data by the timestamp field with
“operators”: special values that act as functions in retrieving data.
Listing 3.4 shows how we can drill down to extract data only from this hour.
We use the $gt (“greater than”) and $lte (“less than or equal to”) operators to
select timestamps that fall in this time range. Notice that there is no explicit
“AND” or “OR” operator specified. MongoDB treats all co-occurring key/value pairs
as “AND”s unless an “OR” is explicitly requested with the $or operator (see
footnote 5). Finally, the result of this query is passed to the count function,
which returns the number of documents matched by the find function.
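The shape of such a filter can be sketched in Python as a plain dictionary. This is not Listing 3.4 itself: the helper name hour_filter is hypothetical, and the sketch assumes the timestamp field stores datetime values, which the text does not state. The bounds follow the operators named above, so the start of the hour is excluded and the end is included.

```python
from datetime import datetime, timedelta


def hour_filter(hour_start, field="timestamp"):
    """Build a filter for documents with hour_start < field <= hour_start + 1h.

    Both operators sit under the same key, so MongoDB applies them together
    as an implicit AND; no explicit operator is needed.
    """
    hour_end = hour_start + timedelta(hours=1)
    return {field: {"$gt": hour_start, "$lte": hour_end}}


if __name__ == "__main__":
    # Illustrative hour only; not taken from the dataset.
    query = hour_filter(datetime(2012, 1, 1, 10))
    print(query)
    # With pymongo (an assumption), the matching documents could be counted
    # with: collection.count_documents(query)
```

An explicit OR would instead wrap alternative filters in a list, as in {"$or": [filter_a, filter_b]}.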
5 For more operators, see http://docs.mongodb.org/manual/reference/operator/.