Database Reference
In-Depth Information
Listing 3.7
MapReduce function that lists the most mentioned users
> /*
>  * This function extracts each user mentioned,
>  * and the count of each mention.
>  * The function takes 0 parameters, as the document
>  * will be passed through context (the 'this' object).
>  */
> var mapFunction = function(){
...     //loop through all of the mentions in the document.
...     var userMentions = this.entities.user_mentions;
...     for(var i = 0; i < userMentions.length; i++){
...         //check that the username is not blank.
...         if(userMentions[i].screen_name.length > 0){
...             //emit the username (key) and
...             //the count (value, in this case always 1).
...             emit(userMentions[i].screen_name, 1);
...         }
...     }
... }
> /*
>  * This function sums the number of mentions of each user
>  */
> var reduceFunction = function(keyUsername, occurs){
...     return Array.sum(occurs);
... }
> // Perform the MapReduce operation, and store the results
> // in a new collection, "most_mentioned_users".
> db.tweets.mapReduce(mapFunction, reduceFunction, { "out" : "most_mentioned_users" });
> // List the top 5 most-mentioned users
> db.most_mentioned_users.find().sort({ "value" : -1}).limit(5)
{ "_id" : "MikeBloomberg" , "value" : 727 }
{ "_id" : "OccupyWallSt" , "value" : 588 }
{ "_id" : "OccupyWallStNYC" , "value" : 428 }
{ "_id" : "JoshHarkinson" , "value" : 295 }
{ "_id" : "ydanis" , "value" : 260 }
Source: Chapter3/mapreduce.js
In Listing 3.7, the MapReduce operation is constructed as follows. The map function,
mapFunction, looks at each individual Tweet and pulls out the mentioned users.
For each mention, it emits a key/value pair to be sent to the reducer: the key is the
screen name of the user that was mentioned, and the value is 1. MongoDB then groups
the emitted values by key and calls the reduce function, reduceFunction, with each
unique key and the list of values emitted for it. The reducer sums this list. The result
is a list of mentioned users, each paired with the number of times that user was mentioned.
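The flow described above can be traced outside MongoDB with a small in-memory sketch. This is plain JavaScript, not the mongo shell: the sample documents, the runMapReduce helper, and the names mapTweet and reduceCounts are hypothetical stand-ins introduced only for illustration, and the grouping step is a simplification of what MongoDB does internally.

```javascript
// Hypothetical sample documents, shaped like the Tweets used in Listing 3.7.
const sampleTweets = [
  { entities: { user_mentions: [{ screen_name: "alice" }, { screen_name: "bob" }] } },
  { entities: { user_mentions: [{ screen_name: "alice" }] } },
  { entities: { user_mentions: [{ screen_name: "" }, { screen_name: "alice" }] } },
];

// Map step: same logic as mapFunction, except that the document and emit
// are passed as arguments instead of arriving through 'this' and a global.
function mapTweet(tweet, emit) {
  const userMentions = tweet.entities.user_mentions;
  for (let i = 0; i < userMentions.length; i++) {
    // Skip blank usernames, as the listing does.
    if (userMentions[i].screen_name.length > 0) {
      emit(userMentions[i].screen_name, 1);
    }
  }
}

// Reduce step: plain-JavaScript stand-in for Array.sum(occurs).
function reduceCounts(keyUsername, occurs) {
  return occurs.reduce((total, n) => total + n, 0);
}

// Simplified driver (an assumption, not MongoDB's implementation):
// collect emitted values per key, then reduce each key's list once.
function runMapReduce(docs, mapFn, reduceFn) {
  const grouped = new Map();
  const emit = (key, value) => {
    if (!grouped.has(key)) grouped.set(key, []);
    grouped.get(key).push(value);
  };
  docs.forEach((doc) => mapFn(doc, emit));

  const results = [];
  for (const [key, values] of grouped) {
    results.push({ _id: key, value: reduceFn(key, values) });
  }
  // Mirror sort({ "value" : -1 }) from the listing.
  return results.sort((a, b) => b.value - a.value);
}

const mentionCounts = runMapReduce(sampleTweets, mapTweet, reduceCounts);
console.log(mentionCounts);
```

The output has the same shape as the documents stored in most_mentioned_users: one entry per mentioned user, with the username in _id and the mention count in value.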
 