Assuming that the a priori probability for $a_i$ to be relevant is equal to that
of not being relevant:
$$\prod_{j=1}^{\omega} P(a_i \notin B \mid B_j) > \prod_{j=1}^{\omega} P(a_i \in B \mid B_j). \tag{13.17}$$
Using the complete probability theorem:
$$\prod_{j=1}^{\omega} P(a_i \notin B \mid B_j) > \prod_{j=1}^{\omega} \bigl(1 - P(a_i \notin B \mid B_j)\bigr). \tag{13.18}$$
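As a minimal sketch of the comparison in (13.18), the function below takes one estimate of the probability that $a_i$ is not in $B$ per feature selector and reports whether the attribute would be filtered out. The name `should_filter` and the list-based interface are illustrative assumptions, not part of the original text.

```python
from math import prod


def should_filter(p_irrelevant):
    """Apply the comparison in (13.18) to one attribute.

    p_irrelevant: one value per feature selector j, each an estimate of
    P(a_i not in B | B_j) in the range [0, 1]. (Hypothetical interface.)
    """
    # Left-hand side: product of the "irrelevant" probabilities.
    lhs = prod(p_irrelevant)
    # Right-hand side: product of the complements, i.e. the "relevant" side.
    rhs = prod(1.0 - p for p in p_irrelevant)
    return lhs > rhs


# Three selectors that mostly regard the attribute as irrelevant.
print(should_filter([0.9, 0.6, 0.4]))
# Three selectors that mostly regard it as relevant.
print(should_filter([0.2, 0.5, 0.6]))
```

With many selectors the raw products can become very small, so comparing sums of logarithms instead may be preferable; that variant needs separate handling of probabilities equal to 0 or 1.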
Because we are using non-ranker feature selectors the above probability is
estimated using:
$$P(a_i \notin B \mid B_j) \approx \begin{cases} P(a \notin B \mid a \in B_j) & \text{if } a_i \in B_j \\ P(a \notin B \mid a \notin B_j) & \text{if } a_i \notin B_j \end{cases} \tag{13.19}$$
Note that $P(a \notin B \mid a \in B_j)$ does not refer to a specific attribute, but to the
general bias of the feature selector $j$. In order to estimate the remaining
probabilities, we add to the dataset a set of $\varphi$ contrast attributes
that are known to be truly irrelevant, and count the number of artificial
features $\varphi_j$ included in the subset $B_j$ obtained by the feature selector $j$:
$$P(a \in B_j \mid a \notin B) = \frac{\varphi_j}{\varphi}; \qquad P(a \notin B_j \mid a \notin B) = 1 - \frac{\varphi_j}{\varphi}. \tag{13.20}$$
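The estimate in (13.20) can be sketched in a few lines. The block below first builds contrast attributes by shuffling a copy of each original column (the permutation construction that the text describes next), then computes the two rates from the set of attributes a selector picked. The names `make_contrasts` and `contrast_rates`, the column-as-list data layout, and the use of integer attribute ids are all assumptions made for illustration.

```python
import random


def make_contrasts(columns, seed=0):
    """Return one randomly permuted copy of each original column.

    columns: list of n columns, each a list of m values.
    Shuffling keeps the column's value distribution but breaks any link
    to the target, so each copy is irrelevant by construction.
    """
    rng = random.Random(seed)
    contrasts = []
    for col in columns:
        shuffled = list(col)  # copy, so the original column is untouched
        rng.shuffle(shuffled)
        contrasts.append(shuffled)
    return contrasts


def contrast_rates(selected, contrast_ids):
    """Estimate the two probabilities in (13.20) for one selector j.

    selected: ids of the attributes in B_j.
    contrast_ids: ids of the phi artificial contrast attributes.
    Returns (P(a in B_j | a not in B), P(a not in B_j | a not in B)).
    """
    phi = len(set(contrast_ids))
    if phi == 0:
        raise ValueError("need at least one contrast attribute")
    phi_j = len(set(selected) & set(contrast_ids))
    p_selected = phi_j / phi
    return p_selected, 1.0 - p_selected


# Attributes 0-4 are original, 5-8 are contrasts; the selector kept two contrasts.
print(contrast_rates({0, 2, 5, 7}, {5, 6, 7, 8}))
```

A selector that lets many contrasts through has a high first rate, which is the "general bias" the estimate is meant to capture.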
The artificial contrast variables are obtained by randomly permuting
the values of the original $n$ attributes across the $m$ instances. Generating
random attributes from some simple distribution, such as the normal
distribution, is not sufficient, because the values of the original attributes may
exhibit some special structure. Using Bayes' theorem:
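$$P(a \notin B \mid a \in B_j) = \frac{P(a \notin B)\, P(a \in B_j \mid a \notin B)}{P(a \in B_j)} = \frac{P(a \notin B)\, \dfrac{\varphi_j}{\varphi}}{P(a \in B_j)} \tag{13.21}$$

$$P(a \notin B \mid a \notin B_j) = \frac{P(a \notin B)\, P(a \notin B_j \mid a \notin B)}{P(a \notin B_j)} = \frac{P(a \notin B) \left(1 - \dfrac{\varphi_j}{\varphi}\right)}{1 - P(a \in B_j)}, \tag{13.22}$$

where $P(a \in B_j) = \dfrac{|B_j|}{n + \varphi}$.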
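To make (13.21) and (13.22) concrete, the sketch below computes both plug-in estimates for a single feature selector and then picks the branch of (13.19) that applies to a given attribute. Several things here are assumptions rather than statements from the text: the function names, the default prior of 0.5 (which mirrors the equal-prior assumption made earlier), counting $|B_j|$ over all $n + \varphi$ attributes, and clipping the results at 1, since a ratio of estimated quantities is not guaranteed to stay inside [0, 1].

```python
def irrelevance_estimates(size_bj, n, phi, phi_j, prior_irrelevant=0.5):
    """Plug-in estimates of (13.21) and (13.22) for one selector j.

    size_bj: |B_j|, assumed to be counted over all n + phi attributes.
    n: number of original attributes; phi: number of contrast attributes.
    phi_j: number of contrast attributes that selector j kept.
    prior_irrelevant: P(a not in B); 0.5 is an assumed default.
    Returns (P(a not in B | a in B_j), P(a not in B | a not in B_j)).
    """
    if phi <= 0:
        raise ValueError("need at least one contrast attribute")
    p_selected = size_bj / (n + phi)  # P(a in B_j)
    if not 0.0 < p_selected < 1.0:
        raise ValueError("B_j must be a non-empty proper subset")
    rate = phi_j / phi  # P(a in B_j | a not in B), as in (13.20)
    p_if_selected = prior_irrelevant * rate / p_selected  # (13.21)
    p_if_rejected = prior_irrelevant * (1.0 - rate) / (1.0 - p_selected)  # (13.22)
    # Clipping is an addition of this sketch, not part of the formulas.
    return min(1.0, p_if_selected), min(1.0, p_if_rejected)


def estimate_for_attribute(is_selected, p_if_selected, p_if_rejected):
    """Choose the branch of (13.19) for an attribute a_i."""
    return p_if_selected if is_selected else p_if_rejected


# 10 original attributes, 10 contrasts; the selector kept 5 attributes,
# one of which is a contrast.
p_in, p_out = irrelevance_estimates(size_bj=5, n=10, phi=10, phi_j=1)
print(p_in, p_out)
print(estimate_for_attribute(True, p_in, p_out))
```

In this example the two estimates come out at roughly 0.2 and 0.6: a selector that admits few contrasts makes its selected attributes unlikely to be irrelevant, and its rejected attributes more likely so.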