Fig. 11.6 Running example used to illustrate hubness-aware classifiers. Instances belong to two classes, denoted by circles and rectangles. The triangle is an instance to be classified.

Circles (instances 1-6) and rectangles (instances 7-10) denote the training data: circles belong to class 1, while rectangles belong to class 2. The triangle (instance 11) is an instance that has to be classified.
For simplicity, we use k = 1 and we calculate N_1(x), GN_1(x) and BN_1(x) for the instances of the training data. For each training instance shown in Fig. 11.6, an arrow denotes its nearest neighbor in the training data. Whenever an instance x' is a good neighbor of x, there is a continuous arrow from x to x'; in case x' is a bad neighbor of x, there is a dashed arrow from x to x'.

We can see, e.g., that instance 3 appears twice as a good nearest neighbor of other training instances, while it never appears as a bad nearest neighbor; therefore, GN_1(x_3) = 2, BN_1(x_3) = 0 and N_1(x_3) = GN_1(x_3) + BN_1(x_3) = 2. For instance 6, the situation is the opposite: GN_1(x_6) = 0, BN_1(x_6) = 2 and N_1(x_6) = GN_1(x_6) + BN_1(x_6) = 2. Instance 9 appears both as a good and as a bad nearest neighbor: GN_1(x_9) = 1, BN_1(x_9) = 1 and N_1(x_9) = GN_1(x_9) + BN_1(x_9) = 2. The second, third and fourth columns of Table 11.2 show GN_1(x), BN_1(x) and N_1(x) for each instance, together with the calculated means and standard deviations of the distributions of GN_1(x), BN_1(x) and N_1(x).
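The counting of good and bad 1-occurrences described above can be sketched in Python as follows. The function name `hubness_scores` and the tiny three-point dataset are our own illustration for checking the bookkeeping, not the actual data of Fig. 11.6:

```python
import numpy as np

def hubness_scores(X, y):
    """For k = 1: count how often each instance appears as the nearest
    neighbor of other instances (N_1), split into good occurrences
    (GN_1, labels match) and bad occurrences (BN_1, labels differ)."""
    n = len(X)
    N1 = np.zeros(n, dtype=int)
    GN1 = np.zeros(n, dtype=int)
    BN1 = np.zeros(n, dtype=int)
    for i in range(n):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                 # an instance is not its own neighbor
        nn = int(np.argmin(d))        # nearest neighbor of instance i
        N1[nn] += 1
        if y[nn] == y[i]:
            GN1[nn] += 1              # good 1-occurrence: same class label
        else:
            BN1[nn] += 1              # bad 1-occurrence: different class label
    return N1, GN1, BN1

# Toy check: the middle point is the nearest neighbor of both others,
# once with a matching label (good) and once with a different one (bad).
N1, GN1, BN1 = hubness_scores(np.array([[0.0], [1.0], [10.0]]),
                              np.array([0, 0, 1]))
```

By construction, N_1(x) = GN_1(x) + BN_1(x) holds for every instance, mirroring the identity used in the text.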
While calculating N_k(x), GN_k(x) and BN_k(x), we used k = 1. Note, however, that we do not necessarily have to use the same k for the kNN classification of the unlabeled/test instances. In fact, in the case of kNN classification with k = 1, only one instance is taken into account for determining the class label, and therefore the weighting procedure described above does not make any difference to simple 1-nearest neighbor classification. In order to illustrate the use of the weighting procedure, we classify instance 11 with a k = 2 nearest neighbor classifier, while N_k(x), GN_k(x) and BN_k(x) were calculated using k = 1. The two nearest neighbors of instance 11 are instances 6 and 9. The weights associated with these instances are:

$$w_6 = e^{-h_b(x_6)} = e^{-\frac{BN_1(x_6) - \mu_{BN_1(x)}}{\sigma_{BN_1(x)}}} = e^{-\frac{2 - 0.3}{0.675}} \approx 0.0806$$
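The weight computation can be sketched as follows. We assume, consistently with the counts given earlier in the text, that instances 6 and 9 are the only bad 1-occurrences among the ten training instances (BN_1(x_6) = 2, BN_1(x_9) = 1, all others 0), and that the quoted σ ≈ 0.675 is the sample standard deviation over those ten values:

```python
import numpy as np

# BN_1 values of the ten training instances; only instance 6 (index 5)
# and instance 9 (index 8) ever appear as bad nearest neighbors.
BN1 = np.array([0, 0, 0, 0, 0, 2, 0, 0, 1, 0], dtype=float)

mu = BN1.mean()           # mean of BN_1: 3/10 = 0.3
sigma = BN1.std(ddof=1)   # sample standard deviation, approx. 0.675

def weight(bn):
    """Weight of a neighbor: w(x) = exp(-h_b(x)),
    with standardized bad hubness h_b(x) = (BN_1(x) - mu) / sigma."""
    return np.exp(-(bn - mu) / sigma)

w6 = weight(BN1[5])       # weight of instance 6, approx. 0.0806
w9 = weight(BN1[8])       # weight of instance 9
```

Since instance 6 has the higher bad hubness, its weight is much smaller than that of instance 9, so in the weighted 2-NN vote the class of instance 9 dominates.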