items and can rate them. From the most rated items in each cluster (say
the top 10%), we select the item that maximizes the Euclidean distance
from the counter cluster. Finally, note that the same item cannot be used
twice (i.e., in two different pairwise comparisons). First, this prevents
monotonous visuals that might bore the user. More importantly, we obtain
a clearer picture of user preferences by eliciting her responses to a variety
of items from the same cluster. While the above item selection is ad hoc,
the reader should bear in mind that it is only a secondary criterion that
follows the pairwise cluster selection. The focus is on finding good enough
items almost instantaneously.
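
As a rough illustration of this selection step, consider the Python sketch
below. The (item_id, rating_count, latent_vector) tuple layout, the
counter_centroid argument, and the top_fraction parameter are illustrative
assumptions rather than the book's notation:

    import numpy as np

    def select_comparison_item(cluster_items, counter_centroid, used_items,
                               top_fraction=0.1):
        """Pick the item that represents a cluster in a pairwise comparison.

        cluster_items: list of (item_id, rating_count, latent_vector) tuples,
            where latent_vector is a NumPy array
        counter_centroid: latent-space centroid of the counter cluster
        used_items: set of item ids already shown in earlier comparisons
        """
        # Exclude items already used in a previous comparison.
        candidates = [it for it in cluster_items if it[0] not in used_items]
        if not candidates:
            raise ValueError("every item in this cluster was already shown")

        # Restrict to the most rated items (say, the top 10%).
        candidates.sort(key=lambda it: it[1], reverse=True)
        candidates = candidates[:max(1, int(len(candidates) * top_fraction))]

        # Among those, take the item farthest (in Euclidean distance)
        # from the counter cluster.
        best = max(candidates,
                   key=lambda it: np.linalg.norm(it[2] - counter_centroid))
        used_items.add(best[0])
        return best[0]

Sorting by rating count first keeps the comparison restricted to items the
user is likely to recognize; the distance criterion then makes the two
presented items as dissimilar as possible.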
16.3.8 Training a Lazy Decision Tree
We utilize decision trees to initiate a profile for a new user. In particular,
we use the top-down lazy decision tree (LazyDT) as the base algorithm,
replacing its splitting criterion with the one described in Equation (16.9).
The pairwise comparisons are located at the decision nodes. Each node in
the tree is associated with a set of corresponding users. This allows the
user to quit the profile-initialization process anytime she wants. If the
user does not wish to answer more questions, we take the list of users
associated with the current node and calculate the mean of their factorized
vectors. The more questions the user answers, the more specific her vector
becomes. But even if she answers only a few questions, we can still provide
the user a profile vector.
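
A minimal sketch of this fallback, assuming the tree node carries the ids
of its associated users and that user_factors maps each user id to her
factorized (latent) vector:

    import numpy as np

    def profile_vector(node, user_factors):
        """Initial profile for a newcomer who stops answering at `node`.

        node.users: ids of the existing users associated with this node
        user_factors: dict mapping user id -> factorized (latent) vector
        """
        # The newcomer's profile is the mean of the latent vectors of all
        # users who answered the comparisons along this path the same way.
        vectors = np.array([user_factors[u] for u in node.users])
        return vectors.mean(axis=0)

With no answers at all this degenerates to the average user vector at the
root; each additional answer narrows the user set and sharpens the profile.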
Figure 16.6 illustrates the root of the decision tree. Each inner node
represents a different pairwise comparison. Based on the response to the
first (root) question, the decision tree is used to select the next pairwise
comparison to be asked. Note that one possible outcome is “unknown”;
this option is selected when the user does not recognize the items. Every
path in the decision tree represents a certain set of comparisons with which
the user is presented. Assuming that there are k clusters, the longest
possible path contains k(k − 1)/2 inner nodes (comparisons).
Note that if there are s possible outcomes in each pairwise comparison
and there are k clusters, then the complete decision tree will have
s^(k(k−1)/2) leaves. Even for moderate values of k this becomes
impractical: with s = 3 outcomes and k = 10 clusters, for instance, the
complete tree would already have 3^45 ≈ 3 × 10^21 leaves. Thus, instead
of building the entire decision tree, we take the lazy approach. If the
current newcomer has reached a leaf and is willing to answer additional
questions, only then do we expand the current node using the suggested
splitting criterion. In this way, the tree expands gradually and on
demand, as illustrated in the sketch below. Given a limited memory size,
we keep only the most frequently visited nodes.
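
One possible shape for this on-demand expansion is sketched below;
get_answer, split_node, and the bounded node_cache (with its full and
evict_least_visited methods) are hypothetical names standing in for the
mechanisms described above:

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        comparison: tuple            # pair of item ids shown at this node
        users: list                  # existing users associated with the node
        children: dict = field(default_factory=dict)  # outcome -> Node
        visits: int = 0

    def descend(root, get_answer, split_node, node_cache):
        """Walk the lazy tree for one newcomer, expanding only on demand.

        get_answer(comparison) returns 'A', 'B', or 'unknown', or None
        when the user declines to answer further questions; split_node(node)
        builds the children using the criterion of Equation (16.9).
        """
        node = root
        while True:
            answer = get_answer(node.comparison)
            if answer is None:            # user quits; her profile is built
                return node               # from this node's user list
            if not node.children:         # at a leaf but still answering:
                if node_cache.full():     # hypothetical bounded cache that
                    node_cache.evict_least_visited()  # drops rare nodes
                node.children = split_node(node)      # expand lazily
            node = node.children[answer]
            node.visits += 1              # visit frequency drives the cache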