documents follow an exponential distribution. Training data can be used to
estimate the model parameters, and the threshold can be found by optimizing
the expected utility under the estimated model (7). However, an adaptive
filtering system receives feedback only for documents delivered to and rated
by the user; model estimation techniques that assume random sampling therefore
usually produce biased estimates and should be avoided (72).
8.3.2.2 Filtering as text classification
Text classification is another well-studied area. A typical classification
system learns a classifier from a labeled training dataset and then assigns
unlabeled test documents to classes. A popular approach is to treat filtering
as a binary text classification task with two classes: relevant and
non-relevant. The filtering system learns the user profile as a classifier
and delivers a document to the user if the classifier labels it relevant or
estimates a high probability of relevance. State-of-the-art text classification
algorithms, such as support vector machines (SVM), K nearest neighbors (K-NN),
neural networks, logistic regression, and Winnow, have been used to
solve this binary classification task (32) (13) (46) (64) (71) (54) (38) (61)
(30) (55).
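As a rough illustration of the profile-as-classifier idea, the sketch below trains a tiny logistic regression profile on bag-of-words counts and delivers a document when the estimated probability of relevance exceeds a threshold. Everything in it is an assumption made for the example: the function names, the whitespace tokenizer, the learning rate, the 0.5 default threshold, and the four toy documents. It is not the method of any of the cited systems.

```python
import math
from collections import Counter


def featurize(text):
    """Bag-of-words term counts (lowercased, whitespace tokenization)."""
    return Counter(text.lower().split())


def predict_proba(weights, bias, features):
    """Estimated probability that a document is relevant."""
    score = bias + sum(weights.get(t, 0.0) * c for t, c in features.items())
    return 1.0 / (1.0 + math.exp(-score))


def train_profile(labeled_docs, epochs=200, lr=0.5):
    """Fit logistic regression weights with plain stochastic gradient ascent."""
    weights, bias = {}, 0.0
    data = [(featurize(text), label) for text, label in labeled_docs]
    for _ in range(epochs):
        for features, label in data:
            # Gradient of the log-likelihood is (label - prediction) * feature.
            error = label - predict_proba(weights, bias, features)
            bias += lr * error
            for term, count in features.items():
                weights[term] = weights.get(term, 0.0) + lr * error * count
    return weights, bias


def should_deliver(weights, bias, text, threshold=0.5):
    """Deliver when the estimated probability of relevance exceeds the threshold."""
    return predict_proba(weights, bias, featurize(text)) > threshold


# Toy training data (hypothetical): label 1 = relevant, 0 = non-relevant.
training = [
    ("stock market rallies on earnings", 1),
    ("central bank raises interest rates", 1),
    ("local team wins football final", 0),
    ("new recipe for chocolate cake", 0),
]
weights, bias = train_profile(training)
```

Calling `should_deliver(weights, bias, "central bank raises interest rates")` then applies the learned profile to a document; the threshold argument is where a utility-driven cutoff would be plugged in.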
Instead of minimizing classification error, an adaptive filtering system needs
to optimize the standard evaluation measure, such as a user utility. For
example, to optimize the utility measure $T11U = 2R^{+} - N^{+}$ (Equation
8.4), where $R^{+}$ and $N^{+}$ count the relevant and non-relevant documents
delivered, a filtering system usually delivers a document to the user if the
probability of relevance $p$ is above 1/3, or about 33% (45), because the
expected utility of delivering it, $2p - (1 - p) = 3p - 1$, is positive only
when $p > 1/3$. Some machine learning approaches, such as logistic regression
or neural networks, estimate the probability of relevance directly, which
makes it easier to make the binary decision of whether to deliver a document.
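The link between a linear utility and a delivery threshold can be worked out in a few lines. The sketch below assumes the utility gives a fixed credit for each relevant document delivered, charges a fixed penalty for each non-relevant document delivered, and scores undelivered documents as zero; for T11U this is read as a credit of 2 and a penalty of 1. The function and parameter names are hypothetical.

```python
def delivery_threshold(credit_relevant, penalty_nonrelevant):
    """Break-even probability of relevance for a linear utility.

    Delivering a document with relevance probability p has expected utility
    credit_relevant * p - penalty_nonrelevant * (1 - p); not delivering
    scores 0. Setting the expression to zero and solving for p gives the
    threshold returned here.
    """
    return penalty_nonrelevant / (credit_relevant + penalty_nonrelevant)


def deliver_by_utility(p_relevant, credit_relevant=2.0, penalty_nonrelevant=1.0):
    """Deliver only when the expected utility of delivery is positive."""
    return p_relevant > delivery_threshold(credit_relevant, penalty_nonrelevant)


# Assumed T11U coefficients: +2 per relevant, -1 per non-relevant delivered.
threshold = delivery_threshold(2.0, 1.0)
```

With these assumed coefficients the break-even point is one third, so a classifier that outputs calibrated probabilities can be turned into a delivery rule without retraining when the utility changes.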
Many standard text classification algorithms do not work well for a new
user, for whom the system has little or no training data. Several approaches
have been developed to handle this initialization problem. For example,
researchers have found that retrieval techniques, such as Rocchio, work well
in the early stage of filtering, when the system has very little training
data, whereas statistical text classification techniques, such as logistic
regression, work well in the later stage, when the system has accumulated
enough training data. Techniques that combine different algorithms have been
developed, and their results are promising (71). Another approach, discussed
in the following section, is to initialize the profile of a new user based on
training data from existing users.
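To make the early-stage option concrete, here is a minimal sketch of a Rocchio-style profile: the initial query terms are combined with the centroid of the rated relevant documents and a down-weighted centroid of the rated non-relevant ones, and incoming documents are scored by cosine similarity. The coefficient defaults (`alpha`, `beta`, `gamma`), the choice to drop negative weights, the raw term counts, and the toy data are illustrative assumptions, and this is not the combination technique of (71).

```python
import math
from collections import Counter


def rocchio_profile(query_terms, relevant_docs, nonrelevant_docs,
                    alpha=1.0, beta=0.75, gamma=0.15):
    """Build a term-weight profile from a query and rated documents.

    Each document is a list of terms. Coefficient defaults are illustrative.
    """
    profile = Counter()
    for term, count in Counter(query_terms).items():
        profile[term] += alpha * count
    for docs, coeff in ((relevant_docs, beta), (nonrelevant_docs, -gamma)):
        for doc in docs:
            for term, count in Counter(doc).items():
                # Dividing by len(docs) averages the documents into a centroid.
                profile[term] += coeff * count / len(docs)
    # Keep only positively weighted terms (a design choice for this sketch).
    return {term: w for term, w in profile.items() if w > 0}


def cosine(profile, doc_terms):
    """Cosine similarity between the profile and a document's term counts."""
    doc = Counter(doc_terms)
    dot = sum(w * doc.get(term, 0) for term, w in profile.items())
    norm = (math.sqrt(sum(w * w for w in profile.values()))
            * math.sqrt(sum(c * c for c in doc.values())))
    return dot / norm if norm else 0.0


# Toy example (hypothetical): one rated document of each class.
profile = rocchio_profile(
    ["interest", "rates"],
    relevant_docs=[["bank", "raises", "interest", "rates"]],
    nonrelevant_docs=[["football", "final"]],
)
on_topic = cosine(profile, ["bank", "interest", "rates"])
off_topic = cosine(profile, ["football", "final"])
```

Because the profile starts from the query terms alone, it can score documents before any feedback arrives, which is the property that makes this family of techniques attractive for new users.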
When adapting a text classification technique to the adaptive filtering
task, one must account for the fact that the classes are extremely
unbalanced, because most documents are not relevant. The fact that the
training data are not sampled randomly is a further problem, and one that has
not been well studied.