You will see the output as follows:
14/04/13 21:02:01 INFO MemoryStore: ensureFreeSpace(672960) called with curMem=4006896, maxMem=311387750
14/04/13 21:02:01 INFO MemoryStore: Block broadcast_21 stored as values to memory (estimated size 657.2 KB, free 292.5 MB)
imBroadcast: org.apache.spark.broadcast.Broadcast[org.jblas.DoubleMatrix] = Broadcast(21)
Now we are ready to compute the recommendations for each user. We will do this by applying
a map function to each user factor, within which we perform a matrix multiplication
between the movie-factor matrix and the user-factor vector. The result is a vector of
length 1682 (that is, the number of movies we have) containing the predicted rating for
each movie. We will then sort these predictions by predicted rating, highest first:
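val allRecs = model.userFeatures.map { case (userId, array) =>
  val userVector = new DoubleMatrix(array)
  // Multiply the broadcast movie-factor matrix by the user-factor vector to score every movie
  val scores = imBroadcast.value.mmul(userVector)
  // Pair each score with its 0-based index and sort by score, highest first
  val sortedWithId = scores.data.zipWithIndex.sortBy(-_._1)
  // Convert the 0-based indices to 1-based movie IDs
  val recommendedIds = sortedWithId.map(_._2 + 1).toSeq
  (userId, recommendedIds)
}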
You will see the following on the screen:
allRecs: org.apache.spark.rdd.RDD[(Int, Seq[Int])] = MappedRDD[269] at map at <console>:29
As we can see, we now have an RDD that contains a list of movie IDs for each user ID.
These movie IDs are sorted in descending order of estimated rating, so the first ID in
each list is the movie with the highest predicted rating for that user.
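To make the scoring and sorting step concrete, the following is a minimal plain-Scala sketch that ranks items for a single user without Spark or jblas. The object name RankItemsSketch, the function rankItems, and the toy factor values are hypothetical and chosen only for illustration. It assumes that row i of the item-factor matrix belongs to the item with ID i + 1.

```scala
// Illustrative sketch only: plain Scala arrays stand in for the jblas matrix.
object RankItemsSketch {
  // Hypothetical toy data: 3 items with 2 latent factors each.
  // Assumption: row i holds the factors of the item whose ID is i + 1.
  val itemFactors: Array[Array[Double]] = Array(
    Array(1.0, 0.0),
    Array(0.0, 2.0),
    Array(1.0, 1.0)
  )

  // Score every item by its dot product with the user vector, then return
  // the 1-based item IDs ordered from highest to lowest score.
  def rankItems(userVector: Array[Double]): Seq[Int] = {
    val scores = itemFactors.map { row =>
      row.zip(userVector).map { case (a, b) => a * b }.sum
    }
    scores.zipWithIndex
      .sortBy { case (score, _) => -score }
      .map { case (_, index) => index + 1 }
      .toSeq
  }
}
```

For the user vector (1.0, 3.0), the three scores are 1.0, 6.0, and 4.0, so RankItemsSketch.rankItems(Array(1.0, 3.0)) returns the IDs in the order 2, 3, 1.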
Tip
Note that we needed to add 1 to the returned movie IDs (the _._2 + 1 expression in the
preceding code snippet), as the item-factor matrix is 0-indexed, while our movie IDs start at 1.