Now, consider Example 9.4. There, we observed that the vector for a user
will have positive numbers for stars that tend to appear in movies the user
likes and negative numbers for stars that tend to appear in movies the user
doesn't like. Consider a movie with many stars the user likes, and only a few or
none that the user doesn't like. The cosine of the angle between the user's and
movie's vectors will be a large positive fraction. That implies an angle close to
0, and therefore a small cosine distance between the vectors.
Next, consider a movie with about as many stars the user likes as doesn't
like. In this situation, the cosine of the angle between the user's and the movie's
vectors is around 0, and therefore the angle between the two vectors is around 90 degrees.
Finally, consider a movie with mostly stars the user doesn't like. In that case,
the cosine will be a large negative fraction, and the angle between the two
vectors will be close to 180 degrees, the maximum possible cosine distance. (That
is, user vectors will tend to be much shorter than movie vectors, but only the
direction of the vectors matters.)
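To make the arithmetic concrete, here is a small sketch in Python. The star
weights and movie vectors below are invented for illustration, and cosine
distance is taken as the angle between the vectors, matching the discussion above:

    import math

    def cosine_distance(u, v):
        # Angle between u and v in degrees: 0 for identical direction, 180 for opposite.
        dot = sum(x * y for x, y in zip(u, v))
        norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
        return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

    # One component per star: positive where the user likes the star, negative otherwise.
    user = [0.8, 0.5, -0.6, -0.2]
    movie_liked_stars = [1, 1, 0, 0]      # movie featuring the two stars the user likes
    movie_mixed_stars = [0, 1, 1, 1]      # movie with mostly stars the user doesn't like

    print(cosine_distance(user, movie_liked_stars))   # small angle, close to 0
    print(cosine_distance(user, movie_mixed_stars))   # angle beyond 90 degrees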
9.2.7 Classification Algorithms
A completely different approach to a recommendation system using item profiles
and utility matrices is to treat the problem as one of machine learning. Regard
the given data as a training set, and for each user, build a classifier that predicts
the rating of all items. There are a great number of different classifiers, and it
is not our purpose to teach this subject here. However, you should be aware
of the option of developing a classifier for recommendation, so we shall discuss
briefly one common classifier: decision trees.
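For instance, a per-user training set can be assembled from the utility matrix
and the item profiles. The sketch below uses invented data, and the cutoff of 3
stars for "likes" is an assumption made only for illustration:

    # Invented utility matrix (ratings; None means "not rated") and item profiles.
    utility = {
        "Alice": {"Movie 1": 5, "Movie 2": 1, "Movie 3": None, "Movie 4": 4},
    }
    profiles = {
        "Movie 1": {"stars": {"Star A", "Star B"}, "genre": "comedy"},
        "Movie 2": {"stars": {"Star C"}, "genre": "drama"},
        "Movie 3": {"stars": {"Star A"}, "genre": "drama"},
        "Movie 4": {"stars": {"Star B"}, "genre": "comedy"},
    }

    def training_set(user, like_threshold=3):
        # Pairs (item profile, liked?) for the items this user has actually rated.
        items, labels = [], []
        for movie, rating in utility[user].items():
            if rating is not None:
                items.append(profiles[movie])
                labels.append(rating >= like_threshold)
        return items, labels

    items, labels = training_set("Alice")   # "Movie 3" is unrated; it is what we want to predict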
A decision tree is a collection of nodes, arranged as a binary tree. The
leaves render decisions; in our case, the decision would be “likes” or “doesn't
like.” Each interior node is a condition on the objects being classified; in our
case the condition would be a predicate involving one or more features of an
item.
To classify an item, we start at the root, and apply the predicate at the root
to the item. If the predicate is true, go to the left child, and if it is false, go to
the right child. Then repeat the same process at the node visited, until a leaf
is reached. That leaf classifies the item as liked or not.
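A minimal sketch of this procedure, with a node structure and feature predicates
invented for the example:

    # A node is either a leaf carrying a decision, or an interior node carrying a
    # predicate on item features plus "true" and "false" children.
    class Node:
        def __init__(self, predicate=None, left=None, right=None, decision=None):
            self.predicate = predicate    # function item -> bool (interior nodes only)
            self.left = left              # followed when the predicate is true
            self.right = right            # followed when the predicate is false
            self.decision = decision      # "likes" or "doesn't like" (leaves only)

    def classify(node, item):
        # Walk from the root to a leaf, applying the predicate at each interior node.
        while node.decision is None:
            node = node.left if node.predicate(item) else node.right
        return node.decision

    # An invented two-level tree over the item profiles used earlier.
    tree = Node(
        predicate=lambda item: "Star A" in item["stars"],
        left=Node(decision="likes"),
        right=Node(
            predicate=lambda item: item["genre"] == "comedy",
            left=Node(decision="likes"),
            right=Node(decision="doesn't like"),
        ),
    )

    print(classify(tree, {"stars": {"Star C"}, "genre": "drama"}))   # doesn't like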
Construction of a decision tree requires selection of a predicate for each
interior node. There are many ways of picking the best predicate, but they all
try to arrange that one of the children gets all or most of the positive examples
in the training set (i.e., the items that the given user likes, in our case) and the
other child gets all or most of the negative examples (the items this user does
not like).
Once we have selected a predicate for a node N, we divide the items into
the two groups: those that satisfy the predicate and those that do not. For
each group, we again find the predicate that best separates the positive and
negative examples in that group. These predicates are assigned to the children
of N, and the construction continues recursively in the same manner on each group.
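A rough sketch of this construction, reusing the Node class above. The candidate
predicates and the scoring rule (counting items that land on the wrong side of a
split) are assumptions made for illustration; the text leaves the exact way of
picking the best predicate open:

    def split_errors(pred, items, labels):
        # Items on the "wrong" side: dislikes where the predicate holds,
        # likes where it does not.
        true_side = [lab for it, lab in zip(items, labels) if pred(it)]
        false_side = [lab for it, lab in zip(items, labels) if not pred(it)]
        return true_side.count(False) + false_side.count(True)

    def build_tree(items, labels, candidates, depth=0, max_depth=3):
        # Stop with a leaf when the group is pure or the tree is deep enough.
        if depth == max_depth or len(set(labels)) <= 1:
            majority = labels.count(True) >= labels.count(False)
            return Node(decision="likes" if majority else "doesn't like")
        pred = min(candidates, key=lambda p: split_errors(p, items, labels))
        yes = [(it, lab) for it, lab in zip(items, labels) if pred(it)]
        no = [(it, lab) for it, lab in zip(items, labels) if not pred(it)]
        if not yes or not no:
            # The best predicate does not separate anything; fall back to a leaf.
            majority = labels.count(True) >= labels.count(False)
            return Node(decision="likes" if majority else "doesn't like")
        return Node(
            predicate=pred,
            left=build_tree([it for it, _ in yes], [lab for _, lab in yes],
                            candidates, depth + 1, max_depth),
            right=build_tree([it for it, _ in no], [lab for _, lab in no],
                             candidates, depth + 1, max_depth),
        )

    candidates = [
        lambda item: "Star A" in item["stars"],
        lambda item: item["genre"] == "comedy",
    ]
    user_tree = build_tree(items, labels, candidates)   # items, labels from the earlier sketch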