2. Change directories to your bin folder: cd mongodb/bin.
3. Execute the following command: mongo localhost/tweetdata path/to/example.js. This will run the example on your local MongoDB installation. If you are on Windows, you will have to replace mongo with mongo.exe. (A sketch of what such a script might contain is shown below.)
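The script path above is a placeholder for whichever example you want to run. As a rough illustration only (this is not one of the book's examples), a script executed this way is plain JavaScript; the db variable is already bound to the tweetdata database named on the command line:

// example.js -- hypothetical script; db refers to the tweetdata database
// selected on the command line (localhost/tweetdata).
var cursor = db.tweets.find().limit(3);   // fetch a few documents
while (cursor.hasNext()) {
    printjson(cursor.next());             // pretty-print each one
}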
3.5 Adding Tweets to the Collection
Now that we have a collection in the database, we will add some Tweets to it.
Because MongoDB uses JSON to store its documents, we can import the data
exactly as it was collected from Twitter, with no need to map columns. To load the data, download the Occupy Wall Street dataset, ows.json, included in the supplementary materials. Next, with mongod running, issue the following command 3:
mongoimport -d tweetdata -c tweets --file ows.json
mongoimport is a utility packaged with MongoDB that imports JSON documents. By running the above command, we have added all of the JSON documents in the file to the collection we created earlier. We now have some Tweets stored in our database and are ready to issue commands to analyze this data.
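To confirm that the import worked, you can open the shell with mongo localhost/tweetdata and inspect the collection. The quick checks below are a suggestion, not part of the book's workflow:

// Sanity checks in the mongo shell after the import.
db.tweets.count()      // should report the number of Tweets loaded from ows.json
db.tweets.findOne()    // prints one imported document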
3.6 Optimizing Collections for Queries
To make our documents more accessible, we will extract some key features for
indexing later. For example, while the “created_at” field gives us information
about a date in a readable format, converting it to a JavaScript date each time we
do a date comparison will add overhead to our computations. It makes sense to
add a field “timestamp” whose value contains the Unix timestamp 4 representing
the information contained in “created_at”. This redundancy trades disk space
for efficient computation, a trade-off that matters even more when building real-time applications that rely on big data. Listing 3.1 is a post-processing script that adds
fields that make handling the Twitter data more convenient and efficient.
3 On Windows, replace mongoimport with mongoimport.exe.
4 The number of milliseconds elapsed since January 1st, 1970 (UTC).
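Listing 3.1 itself is not reproduced here. As a minimal sketch of the same idea, the mongo-shell snippet below adds the timestamp field to every document; it assumes that new Date() can parse Twitter's created_at format in your shell's JavaScript engine, and should be adapted if it cannot.

// Sketch only, not Listing 3.1: derive a numeric "timestamp" field
// (milliseconds since January 1st, 1970) from each "created_at" string.
db.tweets.find().forEach(function (doc) {
    var ms = new Date(doc.created_at).getTime();   // parse the readable date
    if (!isNaN(ms)) {                              // skip documents that fail to parse
        db.tweets.update({ _id: doc._id }, { $set: { timestamp: ms } });
    }
});

Once the field exists, date comparisons and indexes can use the numeric timestamp directly instead of re-parsing created_at for every query.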