Representation of an implicit preference and confidence matrix
The implicit model still creates a user- and item-factor matrix. In this case, however, the matrix that the model attempts to approximate is not the overall ratings matrix but the preference matrix P. If we compute a recommendation by taking the dot product of a user- and item-factor vector, the score will not be a direct estimate of a rating. Rather, it will be an estimate of the user's preference for the item (while not strictly bounded between 0 and 1, these scores will generally fall fairly close to that range).
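As a brief sketch of how this looks in MLlib's Scala API: the snippet below assumes an existing SparkContext named sc and a hypothetical input file of user,item,count lines (the file path, IDs, and parameter values are illustrative choices, not fixed by the text).

import org.apache.spark.mllib.recommendation.{ALS, Rating}

// Assumes an existing SparkContext `sc` and a hypothetical file of
// "user,item,count" lines, where count is raw implicit feedback
// (for example, play or view counts).
val rawCounts = sc.textFile("data/user_item_counts.csv").map { line =>
  val Array(user, item, count) = line.split(',')
  Rating(user.toInt, item.toInt, count.toDouble)
}

// trainImplicit factorizes the binary preference matrix P; the raw
// counts enter only through the confidence weights (scaled by alpha).
// Positional arguments: rank = 10, iterations = 10, lambda = 0.01, alpha = 1.0
val model = ALS.trainImplicit(rawCounts, 10, 10, 0.01, 1.0)

// The returned score estimates the user's preference for the item,
// not a rating; it is typically close to, but not bounded by, [0, 1].
val score = model.predict(789, 123)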
Alternating least squares
Alternating Least Squares (ALS) is an optimization technique to solve matrix factorization problems; it is powerful, achieves good performance, and has proven relatively easy to implement in a parallel fashion. Hence, it is well suited for platforms such as Spark. At the time of writing, it is the only recommendation model implemented in MLlib.
ALS works by iteratively solving a series of least squares regression problems. In each iteration, one of the user- or item-factor matrices is treated as fixed, while the other is updated using the fixed factor and the rating data. Then, the factor matrix that was just solved for is, in turn, treated as fixed, while the other is updated. This process continues until the model has converged (or for a fixed number of iterations).
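To make the alternating structure concrete, here is a minimal, self-contained sketch of the updates on a small, fully observed ratings matrix using the Breeze linear algebra library. This is illustrative only: MLlib's implementation handles sparse data and parallelizes the per-user and per-item solves, and all names and values here are our own choices.

import breeze.linalg._

// Toy 4 x 3 ratings matrix (users x items), fully observed for simplicity.
val ratings = DenseMatrix(
  (5.0, 3.0, 1.0),
  (4.0, 2.0, 1.0),
  (1.0, 1.0, 5.0),
  (1.0, 2.0, 4.0))

val rank = 2          // number of latent factors
val lambda = 0.1      // regularization parameter
val numIterations = 10

val userFactors = DenseMatrix.rand(ratings.rows, rank)
val itemFactors = DenseMatrix.rand(ratings.cols, rank)

for (_ <- 1 to numIterations) {
  // Fix the item factors and solve a regularized least squares
  // problem for each user's factor vector.
  val yty = itemFactors.t * itemFactors + DenseMatrix.eye[Double](rank) * lambda
  for (u <- 0 until ratings.rows) {
    userFactors(u, ::) := (yty \ (itemFactors.t * ratings(u, ::).t)).t
  }
  // Now fix the user factors and solve for each item's factor vector.
  val xtx = userFactors.t * userFactors + DenseMatrix.eye[Double](rank) * lambda
  for (i <- 0 until ratings.cols) {
    itemFactors(i, ::) := (xtx \ (userFactors.t * ratings(::, i))).t
  }
}

// Each predicted rating is the dot product of a user- and item-factor vector.
val predictions = userFactors * itemFactors.t

Because each user's (or item's) factor vector can be solved for independently once the other matrix is fixed, these inner loops are exactly the work that Spark distributes across a cluster.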
Note
Spark's documentation for collaborative filtering contains references to the papers that underlie the ALS algorithms implemented for both explicit and implicit data.