(1) Initialize the relevant parameters: F = F; S = φ; D_u = D; D_l = φ.
(2) Repeat
(3)     For each feature f ∈ F do
(4)         Calculate its mutual information I(C; f) on D_u;
(5)         If I(C; f) = 0 then F = F \ {f};
(6)     Choose the feature f with the highest I(C; f);
(7)     S = S ∪ {f}; F = F \ {f};
(8)     Obtain new labeled instances D_l from D_u induced by f;
(9)     Remove them from D_u, i.e., D_u = D_u \ D_l;
(10) Until F = φ or |D_u| = IT.
This algorithm works in a straightforward way. It estimates the mutual information between each candidate feature in F and the label C. During this calculation step, a feature is immediately discarded from F if its mutual information is zero: in that case the probability distribution of the feature is fully random, so it contributes nothing to predicting the unlabeled instances D_u. 70,71 After that, the feature with the highest mutual information is chosen. Note that the search strategy in DMIFS is sequential forward search, which means that the subset selected by DMIFS is an approximate one.
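As a concrete illustration of this loop, the following is a minimal Python sketch of a DMIFS-style selection procedure. It assumes discrete feature values; the helper names and the rule used for the instances "induced by f" (here, instances whose value of the selected feature maps to exactly one class) are illustrative assumptions rather than details taken from the original source.

import math
from collections import Counter

def mutual_information(xs, ys):
    # I(X; Y) in nats for two equally long sequences of discrete values.
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_xy = c / n
        mi += p_xy * math.log(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

def dmifs_sketch(data, labels, features):
    # data: list of dicts mapping feature name -> discrete value.
    unlabeled = list(range(len(data)))   # indices forming D_u
    remaining = list(features)           # F
    selected = []                        # S
    while remaining and unlabeled:
        scores = {}
        for f in list(remaining):
            column = [data[i][f] for i in unlabeled]
            classes = [labels[i] for i in unlabeled]
            mi = mutual_information(column, classes)
            if mi <= 1e-12:              # uninformative on D_u: discard f
                remaining.remove(f)
            else:
                scores[f] = mi
        if not scores:
            break
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
        # Assumed reading of "instances induced by f": instances whose
        # value of the chosen feature corresponds to a single class.
        value_classes = {}
        for i in unlabeled:
            value_classes.setdefault(data[i][best], set()).add(labels[i])
        unlabeled = [i for i in unlabeled
                     if len(value_classes[data[i][best]]) > 1]
    return selected

# Toy usage: feature "a" determines the class, feature "b" is noise.
data = [{"a": 0, "b": 1}, {"a": 1, "b": 1}, {"a": 0, "b": 0}, {"a": 1, "b": 0}]
labels = [0, 1, 0, 1]
print(dmifs_sketch(data, labels, ["a", "b"]))   # -> ["a"]

Because the mutual information is re-estimated only on the shrinking set D_u, features chosen later are credited for discriminating the instances that the earlier features could not.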
8.2.9. Learning to classify by ongoing feature selection
Existing classification algorithms use a set of training examples to select classification features, which are then used for all future applications of the classifier. A major problem with this approach is the selection of a training set: a small set results in reduced performance, while a large set requires extensive training. In addition, class appearance may change over time, requiring an adaptive classification system. That work proposes a solution to these basic problems by developing an on-line feature selection method, which continuously modifies and improves the features used for classification based on the examples provided so far.
Online feature selection (n; k; e): Given a time point in the online learning process following the presentation of e examples and n features, find the subset with k < n features that is maximally informative about the class, estimated on the e examples. For computational efficiency, an on-line selection method 23,72 will also be of use when the set of features to consider is large, even in a non-online scheme. It then becomes possible to
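To make the (n; k; e) definition concrete, the following is a minimal sketch (reusing the mutual_information helper defined in the sketch above) in which, after e examples have been observed, each of the n candidate features is scored by its mutual information with the class and the k best are kept. The incremental bookkeeping and the class and method names are illustrative assumptions, not the method proposed in the cited work.

class OnlineFeatureSelector:
    # Accumulates examples one at a time and, on request, returns the
    # indices of the k features currently most informative about the class.
    def __init__(self, n_features, k):
        self.k = k
        self.columns = [[] for _ in range(n_features)]  # per-feature values
        self.labels = []                                 # labels seen so far

    def observe(self, example, label):
        # example: sequence of n discrete feature values for one instance.
        for j, value in enumerate(example):
            self.columns[j].append(value)
        self.labels.append(label)

    def select(self):
        # Rank the n features by mutual information on the e examples so far.
        scores = [mutual_information(col, self.labels) for col in self.columns]
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        return ranked[:self.k]

# Toy usage with n = 3 features, k = 1, and e = 4 examples.
selector = OnlineFeatureSelector(n_features=3, k=1)
for example, label in [((0, 1, 0), 0), ((1, 1, 1), 1), ((0, 0, 0), 0), ((1, 0, 1), 1)]:
    selector.observe(example, label)
print(selector.select())   # -> [0]; feature 2 is equally informative here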