degrees. Moreover, averages are now 0 in every component, so the suggestion in
part (d) of Exercise 9.2.1 that we should scale in inverse proportion to the average
makes no sense. Suggest a way of finding an appropriate scale for each component
of normalized vectors. How would you interpret a large or small angle between
normalized vectors? What would the angles be for the normalized vectors derived
from the data in Exercise 9.2.1?
EXERCISE 9.2.3 A certain user has rated the three computers of Exercise 9.2.1 as follows:
A: 4 stars, B: 2 stars, C: 5 stars.
(a) Normalize the ratings for this user.
(b) Compute a user profile for the user, with components for processor speed, disk size,
and main memory size, based on the data of Exercise 9.2.1.
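As a hint for part (a), the normalization step can be sketched as follows. This is a minimal sketch that assumes "normalize" means subtracting the user's average rating from each of that user's ratings; the function name `normalize` is hypothetical and is not taken from the text.

```python
# Ratings given by the user in Exercise 9.2.3.
ratings = {"A": 4, "B": 2, "C": 5}


def normalize(user_ratings):
    """Subtract the user's mean rating from each rating (assumed meaning of 'normalize')."""
    mean = sum(user_ratings.values()) / len(user_ratings)
    return {item: value - mean for item, value in user_ratings.items()}


normalized = normalize(ratings)
for item, value in normalized.items():
    print(item, round(value, 3))
```

After this step the ratings sum to zero, so a positive value marks an item the user liked more than their own average and a negative value marks one they liked less.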
9.3 Collaborative Filtering
We shall now take up a significantly different approach to recommendation. Instead of
using features of items to determine their similarity, we focus on the similarity of the user
ratings for two items. That is, in place of the item-profile vector for an item, we use its
column in the utility matrix. Further, instead of contriving a profile vector for users, we
represent them by their rows in the utility matrix. Users are similar if their vectors are close
according to some distance measure such as Jaccard or cosine distance. Recommendation
for a user U is then made by looking at the users that are most similar to U in this sense,
and recommending items that these users like. The process of identifying similar users and
recommending what similar users like is called collaborative filtering.
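The process just described can be sketched in a few lines of Python. This is an illustrative sketch, not the method as the text goes on to develop it. The utility matrix below is made-up data, not that of Fig. 9.1. The helper names (`jaccard_distance`, `cosine_distance`, `recommend`) and the parameters `k` and `min_rating` are assumptions introduced here. Cosine distance is taken as 1 minus the cosine of the angle between the vectors, with blank entries treated as 0; for nonnegative ratings this ranks users in the same order as the angle itself would.

```python
from math import sqrt

# Hypothetical utility matrix: each user's row maps item -> rating.
# Items a user has not rated are simply absent (blank).
ratings = {
    "U": {"m1": 5, "m2": 4, "m3": 1},
    "V": {"m1": 4, "m2": 5, "m4": 5},
    "W": {"m3": 5, "m5": 4},
}


def jaccard_distance(row1, row2):
    """1 - |intersection| / |union| of the sets of rated items; rating values are ignored."""
    items1, items2 = set(row1), set(row2)
    union = items1 | items2
    if not union:
        return 0.0
    return 1.0 - len(items1 & items2) / len(union)


def cosine_distance(row1, row2):
    """1 - cosine of the angle between the two rows, treating blanks as 0."""
    dot = sum(row1[i] * row2[i] for i in row1.keys() & row2.keys())
    norm1 = sqrt(sum(v * v for v in row1.values()))
    norm2 = sqrt(sum(v * v for v in row2.values()))
    if norm1 == 0 or norm2 == 0:
        return 1.0
    return 1.0 - dot / (norm1 * norm2)


def recommend(user, utility, distance, k=1, min_rating=4):
    """Recommend items that the k users closest to `user` rated at least `min_rating`
    and that `user` has not rated yet."""
    others = sorted(
        (u for u in utility if u != user),
        key=lambda u: distance(utility[user], utility[u]),
    )
    picks = []
    for neighbor in others[:k]:
        for item, rating in utility[neighbor].items():
            if rating >= min_rating and item not in utility[user] and item not in picks:
                picks.append(item)
    return picks


print(recommend("U", ratings, cosine_distance))
print(recommend("U", ratings, jaccard_distance))
```

With this toy data both distance measures pick V as the user closest to U, so the item V rated highly that U has not seen is recommended. On other data the two measures can disagree, which is why the choice of measure matters.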
9.3.1 Measuring Similarity
The first question we must deal with is how to measure similarity of users or items from
their rows or columns in the utility matrix. We have reproduced Fig. 9.1 here as Fig. 9.4.
This data is too small to draw any reliable conclusions, but its small size will make clear
some of the pitfalls in picking a distance measure. Observe specifically the users A and C.
They rated two movies in common, but they appear to have almost diametrically opposite
opinions of these movies. We would expect that a good distance measure would make them
rather far apart. Here are some alternative measures to consider.