5.0. You can cross-reference these IDs with the data file provided as part of
the GroupLens data set download.
Running an Item-to-item Recommendation Job
In the previous example, we used the GroupLens data set to generate
recommendations by calculating similarity between users. In this
demonstration, we instead use the notion of item similarity to determine our
item recommendations.
For this exercise, you will reuse the GroupLens data set, because the format
and data requirements for the item-to-item RecommenderJob are the same.
In fact, a significant amount of overlap exists between the two jobs,
including the job parameters.
In the user-to-user example, the Mahout library uses a similarity metric to
form neighborhoods or clusters and then makes recommendations based on
reviews by statistically similar users. The item-to-item recommender takes
a different approach, instead focusing on items (or in our case, movies).
As in the previous example, the item-to-item recommender must calculate a
similarity score, this time between movies rather than users. To do so, the
recommender uses both the user reviews and the co-occurrence of movie
reviews by users. Using this notion of similarity, the job can then generate
recommendations based on the provided input.
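To make the idea of co-occurrence concrete, the following Python sketch counts how often two movies are reviewed by the same user and then scores unseen movies for one user. It is an illustration only, not the code Mahout runs: the function names, user names, and movie IDs are invented for this example, and the scoring rule (co-occurrence count multiplied by the user's rating) is one simple assumption among several possible similarity measures.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_counts(ratings):
    """Count, for each ordered pair of items, how many users rated both."""
    items_by_user = defaultdict(set)
    for user, item, _rating in ratings:
        items_by_user[user].add(item)

    counts = defaultdict(int)
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1  # store both directions for easy lookup
    return counts


def recommend(ratings, user, top_n=3):
    """Score items the user has not rated by weighted co-occurrence."""
    counts = cooccurrence_counts(ratings)
    rated = {item: rating for u, item, rating in ratings if u == user}
    all_items = {item for _u, item, _r in ratings}

    scores = {}
    for candidate in all_items - set(rated):
        # Assumed scoring rule: sum of (co-occurrence count * user's rating).
        score = sum(counts.get((candidate, item), 0) * rating
                    for item, rating in rated.items())
        if score > 0:
            scores[candidate] = score
    # Highest score first; ties broken by item ID for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical (user, movie, rating) triples, not taken from GroupLens.
sample = [
    ("alice", "m1", 5.0), ("alice", "m2", 4.0),
    ("bob", "m1", 4.0), ("bob", "m2", 5.0), ("bob", "m3", 3.0),
    ("carol", "m2", 4.0), ("carol", "m3", 5.0),
]

if __name__ == "__main__":
    print(recommend(sample, "alice"))
```

In this sample, the only movie alice has not rated is m3, so it is the only candidate. Its score comes from how often m3 was reviewed alongside the movies she did rate.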
To generate item-based recommendations, follow these steps:
1. Open the Hadoop command-line console.
2. Mahout uses temporary storage for the intermediate files that its
MapReduce jobs produce. Before you can run a new Mahout job, you need to
purge the temporary directory. Use the following command to delete the
files in the temporary directory:
hadoop fs -rmr -skipTrash /user/<USER>/temp
3. Enter the following command to kick off the item-based
RecommenderJob:
hadoop jar c:\mahout\mahout-core-0.7-job.jar