Algorithm 8 MIFS algorithm.
function MIFS(F - all features in data, S - set of selected features, k - desired size of S, β - regulator parameter)
    initialize: S = {}
    for each feature f_i in F do
        Compute I(C, f_i)
    end for
    Find f_max that maximizes I(C, f)
    F = F − {f_max}
    S = S ∪ {f_max}
    repeat
        for all couples of features (f_i ∈ F, s_j ∈ S) do
            Compute I(f_i, s_j)
        end for
        Find f_max that maximizes I(C, f_i) − β Σ_{s_j ∈ S} I(f_i, s_j)
        F = F − {f_max}
        S = S ∪ {f_max}
    until |S| = k
    return S
end function
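The greedy loop of Algorithm 8 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: features are taken to be discrete, mutual information is estimated from empirical frequencies, and the names mutual_information, mifs, columns and labels are hypothetical helpers that do not come from the text.

```python
from collections import Counter
from math import log


def mutual_information(x, y):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # p(a,b) * log(p(a,b) / (p(a) p(b))), written with raw counts
    return sum(
        (c / n) * log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def mifs(columns, labels, k, beta):
    """Greedy MIFS sketch.

    columns: dict mapping a feature name to its list of discrete values
             (an assumed data layout, chosen for brevity).
    labels:  class values C, one per instance.
    Returns the names of the selected features in selection order.
    """
    remaining = dict(columns)
    # I(C, f_i) is computed once for every feature.
    relevance = {f: mutual_information(v, labels) for f, v in remaining.items()}
    # Running sum of I(f_i, s_j) over the features s_j already in S.
    redundancy = {f: 0.0 for f in remaining}
    selected = []
    while remaining and len(selected) < k:
        # With S empty the penalty is zero, so the first pick maximizes I(C, f).
        best = max(remaining, key=lambda f: relevance[f] - beta * redundancy[f])
        selected.append(best)
        best_values = remaining.pop(best)
        for f, v in remaining.items():
            redundancy[f] += mutual_information(v, best_values)
    return selected
```

Setting beta to 0 reduces the criterion to a plain ranking by I(C, f), while larger values penalize features that share information with those already selected.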
three combinations C16, C17 and C18 constitute the stochastic methods differing
on the evaluation measure. In them, features are randomly generated and a fitness
function, closely related to the evaluation measures, is defined.
As the most typical stochastic techniques, we will discuss two methods here: LVF
(C17) and LVW (C18).
LVF is the acronym of Las Vegas Filter FS [ 31 ]. It consists of a random procedure
that generates random subsets of features and an evaluation procedure that checks
whether each subset satisfies the chosen measure. For more details about LVF, see
Algorithm 9. The evaluation measure used in LVF is the inconsistency rate. It receives as
a parameter the allowed inconsistency rate, which can be estimated by the inconsistency
rate of the data considering all features. The other parameter is the maximum number
of subsets to be generated in the process, which acts as a stopping criterion.
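As a rough illustration of the procedure just described, the following Python sketch draws random subsets and keeps the smallest one whose inconsistency rate stays within the allowed threshold. It is a sketch under assumptions rather than the book's listing: the data are assumed discrete, the inconsistency rate is computed as the share of instances outside the majority class of their feature-value pattern, and the names inconsistency_rate, lvf, rows, max_tries, allowed_rate and seed are hypothetical.

```python
import random
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, subset):
    """Share of instances that disagree with the majority class of their pattern."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        pattern = tuple(row[i] for i in subset)
        groups[pattern][label] += 1
    # In each group of matching patterns, all but the majority class count as inconsistent.
    inconsistent = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return inconsistent / len(rows)


def lvf(rows, labels, max_tries, allowed_rate, seed=0):
    """Las Vegas Filter sketch: random subsets, inconsistency-based acceptance."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    n_features = len(rows[0])
    best = list(range(n_features))  # start from the full feature set
    for _ in range(max_tries):
        size = rng.randint(1, n_features)
        candidate = sorted(rng.sample(range(n_features), size))
        # Accept only strictly smaller subsets that respect the threshold.
        if len(candidate) < len(best) and \
                inconsistency_rate(rows, labels, candidate) <= allowed_rate:
            best = candidate
    return best
```

A typical call would first compute inconsistency_rate on all features and pass that value as allowed_rate, following the estimate suggested above.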
In Algorithm 9, maxTries is a number proportional to the number of original
features (i.e., l × M, where l is a pre-defined constant). The rule of thumb is that
the more features the data has, the more difficult the FS task is. Another way of
setting maxTries is to relate it to the size of the search space we want to explore. If the
complete search space is 2^M and we want to cover p% of the entire space, then
l is chosen so that l × M = 2^M · p%.
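The two ways of setting maxTries can be compared with a few lines of Python. This is a small sketch that assumes the coverage rule means generating p% of the 2^M possible subsets, rounded up; the function names are hypothetical.

```python
from math import ceil


def max_tries_linear(l, m):
    """maxTries proportional to the number of features: l × M."""
    return l * m


def max_tries_coverage(m, p_percent):
    """maxTries needed to cover p% of the 2^M candidate subsets (rounded up)."""
    return ceil(2 ** m * p_percent / 100)


def implied_constant(m, p_percent):
    """Constant l for which l × M matches the coverage-based budget."""
    return max_tries_coverage(m, p_percent) / m
```

Because 2^M grows exponentially with M, a fixed coverage percentage quickly demands far more tries than any moderate constant l would give.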
LVF can be easily modified thanks to the simplicity of the algorithm. Changing the
evaluation measure is the only modification that keeps it within this category. If we
decide to use the accuracy measure, we obtain the LVW (Las Vegas Wrapper
FS) method. To estimate the classifier's accuracy, we usually rely on statistical
validation, such as 10-FCV. LVF and LVW can differ greatly in run
time as well as in the subsets selected. Another difference between LVW and LVF is that the
learning algorithm LA requires its input parameters to be set. Function estimate() in