Collaborative filtering
Collaborative filtering is a wisdom-of-the-crowd approach in which the preferences
of many users with respect to items are used to estimate the preferences of users
for items with which they have not yet interacted. The underlying idea is the notion of
similarity.
In a user-based approach, if two users have exhibited similar preferences (that is, patterns
of interacting with the same items in broadly the same way), then we would assume that
they are similar to each other in terms of taste. To generate recommendations for unknown
items for a given user, we can use the known preferences of other users that exhibit similar
behavior. We can do this by selecting a set of similar users and computing some form of
combined score based on the items they have shown a preference for. The overall logic is
that if other users with similar tastes have shown a preference for a set of items, these
items would tend to be good candidates for recommendation.
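The user-based logic above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the toy `ratings` data, the choice of cosine similarity over co-rated items, the similarity-weighted average, and the function names are all hypothetical and are not taken from Spark or from the text.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}. Names and values are illustrative.
ratings = {
    "alice": {"A": 5.0, "B": 4.0, "C": 1.0},
    "bob": {"A": 4.0, "B": 5.0, "C": 1.0, "D": 5.0},
    "carol": {"A": 1.0, "B": 2.0, "C": 5.0, "D": 1.0},
}


def cosine_similarity(u, v):
    """Cosine similarity of two sparse rating dicts, over the keys they share."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[k] * v[k] for k in common)
    norm_u = sqrt(sum(u[k] ** 2 for k in common))
    norm_v = sqrt(sum(v[k] ** 2 for k in common))
    return dot / (norm_u * norm_v)


def predict_user_based(ratings, user, item, k=2):
    """Estimate a rating as a similarity-weighted average over the k most
    similar users who have rated the item (one possible combined score)."""
    neighbors = [
        (cosine_similarity(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    neighbors.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbors[:k]
    total_sim = sum(sim for sim, _ in top)
    if total_sim == 0:
        return None  # no usable neighbors
    return sum(sim * rating for sim, rating in top) / total_sim


# Estimate how alice might rate item "D", which she has not interacted with.
estimate = predict_user_based(ratings, "alice", "D")
```

In this toy data, alice's ratings track bob's more closely than carol's, so the estimate for "D" is pulled toward bob's rating. Other similarity measures (for example, Pearson correlation) could be swapped in without changing the overall structure.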
We can also take an item-based approach that computes some measure of similarity
between items. This is usually based on the existing user-item preferences or ratings. Items
that tend to be rated similarly by the same users will be classed as similar under this
approach. Once we have these similarities, we can represent a user in terms of the items they
have interacted with and find items that are similar to these known items, which we can
then recommend to the user. Again, a set of items similar to the known items is used to
generate a combined score that estimates the user's preference for an unknown item.
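The item-based variant can be sketched in the same way. Again, this is a hedged illustration rather than a reference implementation: the toy data, the helper names, and the use of cosine similarity with a weighted average are assumptions made for the example.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}. Names and values are illustrative.
ratings = {
    "alice": {"A": 5.0, "B": 4.0, "C": 1.0},
    "bob": {"A": 4.0, "B": 5.0, "C": 1.0, "D": 5.0},
    "carol": {"A": 1.0, "B": 2.0, "C": 5.0, "D": 1.0},
}


def invert(ratings):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    by_item = {}
    for user, items in ratings.items():
        for item, rating in items.items():
            by_item.setdefault(item, {})[user] = rating
    return by_item


def cosine_similarity(a, b):
    """Cosine similarity of two sparse rating dicts, over the keys they share."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[k] * b[k] for k in common)
    norm_a = sqrt(sum(a[k] ** 2 for k in common))
    norm_b = sqrt(sum(b[k] ** 2 for k in common))
    return dot / (norm_a * norm_b)


def predict_item_based(ratings, user, item, k=2):
    """Estimate a rating from the user's own ratings of the k items that are
    most similar to the target item, weighted by item-item similarity."""
    by_item = invert(ratings)
    if item not in by_item:
        return None  # nobody has rated this item, so no similarity is available
    scored = [
        (cosine_similarity(by_item[item], by_item[known]), rating)
        for known, rating in ratings[user].items()
        if known != item
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:k]
    total_sim = sum(sim for sim, _ in top)
    if total_sim == 0:
        return None
    return sum(sim * rating for sim, rating in top) / total_sim


# Estimate alice's preference for "D" from the items she has already rated.
estimate = predict_item_based(ratings, "alice", "D")
```

Here the user is represented purely by the items they have rated, and the similarities are computed between item columns rather than user rows. In practice, item-item similarities are often precomputed, since they tend to change more slowly than individual user profiles.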
The user- and item-based approaches are usually referred to as nearest-neighbor models,
since the estimated scores are computed based on the set of most similar users or items
(that is, their neighbors).
Finally, there are many model-based methods that attempt to model the user-item preferences
themselves so that new preferences can be estimated directly by applying the model to
unknown user-item combinations.
Matrix factorization
Since Spark's recommendation models currently only include an implementation of matrix
factorization, we will focus our attention on this class of models. This focus is with good
reason, however: these types of models have consistently been shown to perform extremely
well in collaborative filtering and were among the best models in well-known competitions
such as the Netflix Prize.
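As a rough illustration of what a matrix factorization model does, the sketch below learns a small vector of latent factors for each user and each item so that their dot product approximates the known ratings, and then applies the model to an unseen user-item pair. It is a toy written in plain Python using stochastic gradient descent; it is not Spark's implementation, and the data, function names, and hyperparameter values are all assumptions chosen for the example.

```python
import random
from math import sqrt

# Hypothetical toy data: (user, item, rating) triples. Values are illustrative.
observed = [
    ("alice", "A", 5.0), ("alice", "B", 4.0), ("alice", "C", 1.0),
    ("bob", "A", 4.0), ("bob", "B", 5.0), ("bob", "C", 1.0), ("bob", "D", 5.0),
    ("carol", "A", 1.0), ("carol", "B", 2.0), ("carol", "C", 5.0), ("carol", "D", 1.0),
]


def init_factors(observed, n_factors=2, seed=0):
    """Create small random factor vectors for every user and item."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in observed})
    items = sorted({i for _, i, _ in observed})
    P = {u: [rng.uniform(0.0, 0.5) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(0.0, 0.5) for _ in range(n_factors)] for i in items}
    return P, Q


def predict(P, Q, user, item):
    """Model estimate: dot product of the user and item factor vectors."""
    return sum(p * q for p, q in zip(P[user], Q[item]))


def rmse(P, Q, observed):
    """Root mean squared error of the model on the observed ratings."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in observed)
    return sqrt(total / len(observed))


def train(P, Q, observed, n_epochs=1000, lr=0.01, reg=0.02):
    """Fit the factors in place with SGD on the L2-regularized squared error."""
    for _ in range(n_epochs):
        for u, i, r in observed:
            err = r - predict(P, Q, u, i)
            for f in range(len(P[u])):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)


P, Q = init_factors(observed)
rmse_before = rmse(P, Q, observed)
train(P, Q, observed)
rmse_after = rmse(P, Q, observed)

# Apply the fitted model to a user-item combination that was never observed.
estimate = predict(P, Q, "alice", "D")
```

Once the factors are learned, estimating a preference for any user-item pair is a single dot product, which is what makes this a model-based method rather than a nearest-neighbor one.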