• Spearman correlation
• Cosine
• Tanimoto coefficient
• Log-likelihood
• Pearson correlation
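If the recommendation job is Apache Mahout's distributed item-based RecommenderJob (an assumption here, made because these measures match its similarity options), the measure can be selected with the --similarityClassname option. In the following sketch, the jar name and input path are placeholders rather than values from this chapter:

hadoop jar mahout-core-<VERSION>-job.jar org.apache.mahout.cf.taste.hadoop.item.RecommenderJob --input /user/<YOUR USERNAME>/chapter15/input/<RATINGS FILE> --output /user/<YOUR USERNAME>/chapter15/output/userrecommendations --similarityClassname SIMILARITY_LOGLIKELIHOOD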
Once the job has started, it will take between 15 and 20 minutes to run if you
are using a four-node cluster. During this time, a series of MapReduce jobs
process the data and generate the movie recommendations.
When the job has completed, you can view the output files using the
following command:
hadoop fs -ls /user/<YOUR USERNAME>/chapter15/output/userrecommendations
You can find the generated recommendations in the part-r-00000 file.
To export the file from HDFS to your local file system, use the following
command:
hadoop fs -copyToLocal /user/<YOUR USERNAME>/chapter15/output/userrecommendations/part-r-00000 c:\<LOCAL OUTPUT DIRECTORY>\recommendations.csv
You can review the file to find the recommendations generated for each user.
The output from the recommendation job takes the following format:
UserID [ItemID:Estimated Rating, ...]
An example of the output is shown here:
1 [1566:5.0,1036:5.0,1033:5.0,1032:5.0,1031:5.0,3107:5.0]
In this example, for the user identified by the ID of 1, we would recommend
the movies identified by the IDs 1566 (The Man from Down Under), 1036
(Drop Dead Fred), 1033 (Homeward Bound II: Lost in San Francisco), and
so on. The estimated rating for each of these movies for this specific user is 5.0.
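As a rough sketch of how this output can be consumed, the following Python code parses each line into a per-user list of (ItemID, rating) pairs. The whitespace delimiter between the UserID and the bracketed list (a tab in standard Mahout output) and the file name recommendations.csv are assumptions based on the example above; adjust them for your environment.

# Minimal sketch: parse recommendation output lines of the form
#   UserID [ItemID:rating,ItemID:rating,...]
# Delimiter and file name are assumptions; adjust as needed.
def parse_recommendations(path):
    recommendations = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Split the user ID from the bracketed recommendation list
            user_id, item_list = line.split(None, 1)
            pairs = item_list.strip("[]").split(",")
            recommendations[user_id] = [
                (item_id, float(rating))
                for item_id, rating in (pair.split(":") for pair in pairs)
            ]
    return recommendations

recs = parse_recommendations("recommendations.csv")
print(recs["1"])  # e.g. [('1566', 5.0), ('1036', 5.0), ...]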