Discovering User Interests by Document
Classification
Loc Nguyen *
User interest is one of the personal traits that attract researchers' attention in user
modeling and user profiling. User interest rivals user knowledge as the most
important characteristic in a user model. Adaptive systems need to know user
interests in order to adapt to the user. For example, adaptive learning systems
tailor learning materials (lessons, examples, exercises, tests, and so on) to
user interests. I propose a new approach for discovering user interests based on
document classification. The basic idea is to treat user interests as classes of
documents, so the process of classifying documents is also the process of
discovering user interests. The approach rests on two new points of view:
The series of accesses in a user's history is modeled as documents, so the user
is indirectly referred to as a “document”.
User interests are the classes to which such documents belong.
The approach consists of the following four steps:
1. Documents in the training corpus are represented according to the vector model.
Each element of a document vector is the product of term frequency and inverse
document frequency. However, the inverse document frequency can be dropped from
each element for convenience.
2. The training corpus is classified by applying a decision tree, a support vector
machine, or a neural network. Classification rules are drawn from the decision
tree, or weight vectors W* from the support vector machine. They are used as
classifiers.
3. The user's access history is mined to find maximum frequent itemsets. Each
itemset is considered an interesting document, and its member items are
considered terms. Such interesting documents are modeled as vectors.
4. The classifiers (see step 2) are applied to these interesting documents in order
to choose the classes most suitable for them. Such classes are the user
interests.
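
The four steps can be sketched end to end in a few lines of Python. This is only a minimal illustration under stated assumptions, not the author's implementation: the training corpus, class labels and access sessions are invented toy data; a multiclass perceptron stands in for the decision tree, support vector machine or neural network named in step 2; "maximum frequent itemsets" is read as maximal frequent itemsets (frequent itemsets with no frequent proper superset) and found by brute force; and the inverse document frequency is dropped, as step 1 permits. All function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: every term, label and session below is invented.
TRAINING = [
    ("java class object method", "programming"),
    ("python method object loop", "programming"),
    ("matrix vector integral", "math"),
    ("integral derivative matrix", "math"),
]
SESSIONS = [  # one set of accessed items per visit in the user's history
    {"java", "object", "method"},
    {"java", "object", "matrix"},
    {"java", "object", "method", "loop"},
]

def tf_vector(terms, vocab):
    """Step 1: term-frequency vector over a fixed vocabulary (IDF dropped)."""
    return [terms.count(t) for t in vocab]

def predict(weights, x):
    """Step 4: choose the class whose weight vector gives the highest score."""
    def score(c):
        return sum(w * xi for w, xi in zip(weights[c], x))
    return max(sorted(weights), key=score)  # ties go to the first class name

def train_perceptron(samples, labels, classes, epochs=20):
    """Step 2 (stand-in classifier): learn one weight vector per class."""
    weights = {c: [0.0] * len(samples[0]) for c in classes}
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            guess = predict(weights, x)
            if guess != y:  # move the right class toward x, the wrong one away
                weights[y] = [w + xi for w, xi in zip(weights[y], x)]
                weights[guess] = [w - xi for w, xi in zip(weights[guess], x)]
    return weights

def maximal_frequent_itemsets(sessions, min_support):
    """Step 3: brute-force frequent itemsets, keeping only the maximal ones."""
    items = sorted(set().union(*sessions))
    frequent = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = set(combo)
            support = sum(candidate <= s for s in sessions)
            if support >= min_support:
                frequent.append(candidate)
    # An itemset is maximal if no frequent itemset strictly contains it.
    return [f for f in frequent if not any(f < g for g in frequent)]

# Steps 1 and 2: vectorize the training corpus and learn the classifiers.
vocab = sorted({t for text, _ in TRAINING for t in text.split()})
samples = [tf_vector(text.split(), vocab) for text, _ in TRAINING]
labels = [label for _, label in TRAINING]
weights = train_perceptron(samples, labels, sorted(set(labels)))

# Steps 3 and 4: mine "interesting documents" and classify them.
itemsets = maximal_frequent_itemsets(SESSIONS, min_support=2)
interests = [predict(weights, tf_vector(sorted(s), vocab)) for s in itemsets]
print(itemsets, interests)
```

With this toy data and a minimum support of 2, the only maximal frequent itemset contains java, method and object, and it is assigned to the class "programming". The brute-force search is exponential in the number of items, so a real access history would need a dedicated frequent-itemset mining algorithm.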
This approach is based on document classification, but it also relates to information
retrieval in the way documents are represented. Hence section 1 discusses the
vector model for representing documents. Support vector machines, decision
trees, and neural networks for document classification are covered in sections 2,
3, and 4, respectively.
 