sufficiently few positive examples that we can conclude all of these items are in
the “don't-like” class. We may then put a leaf with decision “don't like” as the
right child of the root. However, the articles that satisfy the predicate include
a number of articles that user U doesn't like; these are the articles that mention
the Yankees. Thus, at the left child of the root, we build another predicate.
We might find that the predicate “Yankees” OR “Jeter” OR “Teixeira” is the
best possible indicator of an article about baseball and about the Yankees.
Thus, we see in Fig. 9.3 the left child of the root, which applies this predicate.
Both children of this node are leaves, since we may suppose that the items
satisfying this predicate are predominantly negative and those not satisfying it
are predominantly positive.
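
The two-level tree described above can be sketched in a few lines of Python. This is only an illustration, not the tree of Fig. 9.3 itself: the root predicate is not restated in this passage, so the sketch takes it as a parameter, and the names `mentions_any`, `classify`, and `about_baseball` are hypothetical. Articles are assumed to be given as lists of words.

```python
def mentions_any(words, terms):
    """True if at least one of the terms appears in the set of words."""
    return any(term in words for term in terms)


# Predicate at the left child of the root, taken from the text.
YANKEES_TERMS = {"yankees", "jeter", "teixeira"}


def classify(article_words, root_predicate):
    """Walk the two-level tree and return the decision at the leaf reached.

    root_predicate is an assumed stand-in for the predicate at the root,
    which should be true for articles about baseball.
    """
    words = {w.lower() for w in article_words}
    if not root_predicate(words):
        # Right child of the root: a leaf with decision "don't like".
        return "don't like"
    if mentions_any(words, YANKEES_TERMS):
        # Items satisfying the Yankees predicate are predominantly negative.
        return "don't like"
    # Baseball articles that do not mention the Yankees.
    return "like"


def about_baseball(words):
    """Hypothetical root predicate, used only to exercise the sketch."""
    return "baseball" in words
```

For example, `classify(["Baseball", "Jeter"], about_baseball)` reaches the Yankees leaf, while an article that mentions baseball but none of the three Yankees terms reaches the "like" leaf.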
Unfortunately, classifiers of all types tend to take a long time to construct.
For instance, if we wish to use decision trees, we need one tree per user. Con-
structing a tree not only requires that we look at all the item profiles, but we
have to consider many different predicates, which could involve complex com-
binations of features. Thus, this approach tends to be used only for relatively
small problem sizes.
9.2.8 Exercises for Section 9.2
Exercise 9.2.1 : Three computers, A, B, and C, have the numerical features
listed below:
Feature             A       B       C
Processor Speed     3.06    2.68    2.92
Disk Size           500     320     640
Main-Memory Size    6       4       6
We may imagine these values as defining a vector for each computer; for in-
stance, A's vector is [3.06, 500, 6]. We can compute the cosine distance between
any two of the vectors, but if we do not scale the components, then the disk
size will dominate and make differences in the other components essentially in-
visible. Let us use 1 as the scale factor for processor speed, α for the disk size,
and β for the main memory size.
(a) In terms of α and β, compute the cosines of the angles between the vectors
for each pair of the three computers.
(b) What are the angles between the vectors if α = β = 1?
(c) What are the angles between the vectors if α = 0.01 and β = 0.5?
! (d) One fair way of selecting scale factors is to make each inversely propor-
tional to the average value in its component. What would be the values
of α and β, and what would be the angles between the vectors?
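
A small Python helper may be useful for checking numerical answers to parts (b) through (d). It is a minimal sketch, assuming each vector is ordered as [processor speed, disk size, main-memory size] and that the scale factors are applied componentwise as described above. The names `COMPUTERS`, `scale`, `cosine`, and `angle_degrees` are illustrative and do not come from the text.

```python
import math

# Feature vectors from the table: (processor speed, disk size, main-memory size).
COMPUTERS = {
    "A": (3.06, 500, 6),
    "B": (2.68, 320, 4),
    "C": (2.92, 640, 6),
}


def scale(vec, alpha, beta):
    """Scale processor speed by 1, disk size by alpha, main memory by beta."""
    speed, disk, memory = vec
    return (speed, alpha * disk, beta * memory)


def cosine(u, v):
    """Cosine of the angle between u and v: dot product over product of lengths."""
    dot = sum(x * y for x, y in zip(u, v))
    length_u = math.sqrt(sum(x * x for x in u))
    length_v = math.sqrt(sum(y * y for y in v))
    return dot / (length_u * length_v)


def angle_degrees(u, v):
    """Angle between u and v in degrees."""
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    c = max(-1.0, min(1.0, cosine(u, v)))
    return math.degrees(math.acos(c))
```

A call such as `angle_degrees(scale(COMPUTERS["A"], 0.01, 0.5), scale(COMPUTERS["B"], 0.01, 0.5))` gives the angle between A and B under the scale factors of part (c). Comparing it with the same call at α = β = 1 shows how strongly the unscaled disk size dominates.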