Databases Reference
In-Depth Information
be little reason to try to cluster into a small number of clusters immediately.
Rather, a hierarchical approach, where we leave many clusters unmerged may
su ce as a first step.
For example, we might leave half as many clusters as
there are items.
HP
TW
SW
A
4
5
1
B
4.67
C
2
4.5
D
3
3
Figure 9.7: Utility matrix for users and clusters of items
Example 9.10 : Figure 9.7 shows what happens to the utility matrix of Fig. 9.4
if we manage to cluster the three Harry-Potter movies into one cluster, denoted
HP, and also cluster the three Star-Wars movies into one cluster SW.
2
Having clustered items to an extent, we can revise the utility matrix so the
columns represent clusters of items, and the entry for user U and cluster C is
the average rating that U gave to those members of cluster C that U did rate.
Note that U may have rated none of the cluster members, in which case the
entry for U and C is still blank.
We can use this revised utility matrix to cluster users, again using the dis-
tance measure we consider most appropriate. Use a clustering algorithm that
again leaves many clusters, e.g., half as many clusters as there are users. Re-
vise the utility matrix, so the rows correspond to clusters of users, just as the
columns correspond to clusters of items. As for item-clusters, compute the
entry for a user cluster by averaging the ratings of the users in the cluster.
Now, this process can be repeated several times if we like. That is, we can
cluster the item clusters and again merge the columns of the utility matrix that
belong to one cluster. We can then turn to the users again, and cluster the
user clusters. The process can repeat until we have an intuitively reasonable
number of clusters of each kind.
Once we have clustered the users and/or items to the desired extent and
computed the cluster-cluster utility matrix, we can estimate entries in the orig-
inal utility matrix as follows. Suppose we want to predict the entry for user U
and item I:
(a) Find the clusters to which U and I belong, say clusters C and D, respec-
tively.
(b) If the entry in the cluster-cluster utility matrix for C and D is something
other than blank, use this value as the estimated value for the U -I entry
in the original utility matrix.
Search WWH ::




Custom Search