Rating(789,134,5.278933936827717)
Rating(789,156,5.250959077906759)
Rating(789,432,5.169863417126231)
Inspecting the recommendations
We can sense-check these recommendations by comparing the titles of the movies a user
has rated with the titles of the recommended movies. First, we need to load the
movie data (one of the datasets we explored in the previous chapter). We'll
collect this data as a Map[Int, String] that maps each movie ID to its title:
val movies = sc.textFile("/PATH/ml-100k/u.item")
val titles = movies.map(line =>
  line.split("\\|").take(2)).map(array =>
  (array(0).toInt, array(1))).collectAsMap()
titles(123)
The preceding code produces the following output:
res68: String = Frighteners, The (1996)
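The parsing step can also be tried without a cluster. The following is a minimal sketch, not the book's code: it applies the same split-and-take logic to an in-memory sequence of lines, and it assumes we want to skip malformed lines and fall back to a placeholder for unknown IDs. The object name, the parseTitles helper, and the sample lines are all hypothetical.

```scala
object TitleLookupSketch {
  // Parse pipe-delimited lines of the form "id|title|..." into an ID -> title map.
  // Lines with fewer than two fields or a non-numeric ID are skipped instead of throwing.
  def parseTitles(lines: Seq[String]): Map[Int, String] =
    lines.flatMap { line =>
      line.split("\\|").take(2) match {
        case Array(id, title) if id.nonEmpty && id.forall(_.isDigit) =>
          Some(id.toInt -> title)
        case _ =>
          None
      }
    }.toMap

  def main(args: Array[String]): Unit = {
    // Hypothetical sample lines; only the first two fields matter here.
    val sample = Seq(
      "123|Frighteners, The (1996)|remaining|fields",
      "malformed line without delimiters"
    )
    val titles = parseTitles(sample)
    println(titles(123))                               // direct lookup, throws if the key is missing
    println(titles.getOrElse(9999, "<unknown movie>")) // safe lookup with a fallback
  }
}
```

Calling titles(id) directly, as in the text, throws a NoSuchElementException for an ID that is not in the map, so getOrElse may be preferable when the IDs come from model output.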
For user 789, we can find the movies they have rated, take the 10 movies with
the highest ratings, and then check the titles. We first use Spark's keyBy
function to create an RDD of key-value pairs from our ratings RDD, where the
key is the user ID. We then use the lookup function to return just the ratings
for this key (that is, that particular user ID) to the driver:
val moviesForUser = ratings.keyBy(_.user).lookup(789)
Let's see how many movies this user has rated. This will be the size of the
moviesForUser collection:
println(moviesForUser.size)
We will see that this user has rated 33 movies.
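To make the keyBy and lookup step concrete, here is a minimal sketch that runs against a local SparkContext with a handful of made-up ratings. It assumes spark-core and spark-mllib are on the classpath; the object name, the ratingsForUser helper, and the toy data are hypothetical. In spark-shell you would use the existing sc rather than creating a new context.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.recommendation.Rating
import org.apache.spark.rdd.RDD

object UserRatingsSketch {
  // Key every rating by its user ID, then pull the values for one key back to the driver.
  // The result is a local Seq, not an RDD.
  def ratingsForUser(ratings: RDD[Rating], user: Int): Seq[Rating] =
    ratings.keyBy(_.user).lookup(user)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("lookup-sketch")
    val sc = new SparkContext(conf)
    try {
      // Toy ratings for illustration only; these are not taken from the MovieLens data.
      val ratings = sc.parallelize(Seq(
        Rating(789, 1, 3.0),
        Rating(789, 2, 5.0),
        Rating(42, 1, 4.0)
      ))
      val moviesForUser = ratingsForUser(ratings, 789)
      println(moviesForUser.size)

      // A filter followed by collect returns the same ratings without building key-value pairs.
      val viaFilter = ratings.filter(_.user == 789).collect().toSeq
      println(viaFilter.size)
    } finally {
      sc.stop()
    }
  }
}
```

Because lookup returns an ordinary Scala collection on the driver, the later sorting and printing steps run locally rather than as Spark jobs.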
Next, we will take the 10 movies with the highest ratings by sorting the
moviesForUser collection on the rating field of the Rating object. We will
then look up the title for each rating's product ID in our map of movie titles
and print out the top 10 titles with their ratings:
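A minimal sketch of this sort, take, and map sequence follows. It is an illustration under stated assumptions rather than the original listing: it uses a stand-in Rating case class with the same fields as MLlib's Rating so that it runs without Spark, and the object name, the topRated helper, the placeholder titles, and the toy ratings are all hypothetical.

```scala
object TopRatedSketch {
  // Stand-in with the same fields as MLlib's Rating, so the sketch runs without Spark.
  final case class Rating(user: Int, product: Int, rating: Double)

  // Sort by rating in descending order, keep the first n, and pair each with its title.
  // Product IDs missing from the title map get a placeholder instead of throwing.
  def topRated(
      ratings: Seq[Rating],
      titles: scala.collection.Map[Int, String],
      n: Int): Seq[(String, Double)] =
    ratings
      .sortBy(r => -r.rating)
      .take(n)
      .map(r => (titles.getOrElse(r.product, s"<unknown ${r.product}>"), r.rating))

  def main(args: Array[String]): Unit = {
    // Placeholder titles and ratings for illustration only.
    val titles = Map(1 -> "Movie A", 2 -> "Movie B", 3 -> "Movie C")
    val moviesForUser = Seq(
      Rating(789, 1, 3.0),
      Rating(789, 2, 5.0),
      Rating(789, 3, 4.0)
    )
    topRated(moviesForUser, titles, 2).foreach(println)
  }
}
```

Negating the rating inside sortBy gives a descending order. Since sortBy is stable, movies with equal ratings keep their original relative order.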