5.0. You can cross-reference these IDs with the data file provided as part of
the GroupLens data set download.
Running an Item-to-item Recommendation Job
In the previous example, we used the GroupLens data set to generate
recommendations by calculating similarity between users. In this
example, you will generate item-to-item recommendations.
For this exercise, you will reuse the GroupLens data set because the format
and data requirements for the item-to-item RecommendationJob are the same
as those for the user-to-user job.
In fact, a significant amount of overlap exists between the two jobs,
including the job parameters.
In the user-to-user example, the Mahout library uses a similarity metric to
form neighborhoods or clusters and then makes recommendations based on
reviews by statistically similar users. The item-to-item recommender takes
a different approach, instead focusing on items (or in our case, movies).
Much like the former example, the item-to-item recommender must
calculate the similarity between movies. To accomplish this, the
recommender uses both user reviews and the co-occurrence of movie
reviews by users to determine this similarity score. Using this notion of
similarity, the job can then generate recommendations based on the
provided input.
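To make the co-occurrence idea in the paragraph above concrete, the following Python sketch counts how often two movies are rated by the same user and uses those counts, weighted by a user's own ratings, to score movies the user has not yet rated. It is a simplified illustration only and not Mahout's actual implementation; the function names (build_cooccurrence, recommend) and the sample ratings are hypothetical and are not taken from the GroupLens data set.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(ratings):
    """Count how often each pair of items was rated by the same user."""
    items_by_user = defaultdict(set)
    for user, item, _rating in ratings:
        items_by_user[user].add(item)

    cooccurrence = defaultdict(lambda: defaultdict(int))
    for items in items_by_user.values():
        # Every unordered pair of items rated by one user co-occurs once.
        for a, b in combinations(sorted(items), 2):
            cooccurrence[a][b] += 1
            cooccurrence[b][a] += 1
    return cooccurrence


def recommend(ratings, user, top_n=3):
    """Score unrated items by co-occurrence count times the user's rating."""
    cooccurrence = build_cooccurrence(ratings)
    user_ratings = {item: rating for u, item, rating in ratings if u == user}

    scores = defaultdict(float)
    for rated_item, rating in user_ratings.items():
        for other_item, count in cooccurrence[rated_item].items():
            if other_item not in user_ratings:
                scores[other_item] += count * rating

    # Highest score first; ties are broken by item ID for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical sample data: (user_id, movie_id, rating)
SAMPLE_RATINGS = [
    (1, 101, 5.0), (1, 102, 3.0),
    (2, 101, 4.0), (2, 102, 2.0), (2, 103, 5.0),
    (3, 101, 4.0), (3, 103, 4.0),
    (4, 102, 3.0), (4, 104, 4.0),
]

if __name__ == "__main__":
    print(recommend(SAMPLE_RATINGS, user=1))  # [(103, 13.0), (104, 3.0)]
```

In this sample, movie 103 ranks first for user 1 because it co-occurs twice with movie 101, which user 1 rated highly. A production recommender would typically normalize these scores and might use a different similarity measure altogether.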
To generate item-based recommendations, follow these steps:
1. Open the Hadoop command-line console.
2. Mahout uses temporary storage for the intermediate files that its
MapReduce jobs produce. Before you can run a new Mahout job, you need
to purge the temporary directory. Use the following command to delete
the files in the temporary directory:
hadoop fs -rmr -skipTrash /user/<USER>/temp
3. Enter the following Mahout item recommender command to kick off the
item-based recommendation job:
hadoop jar c:\mahout\mahout-core-0.7-job.jar