Hubness-Aware Classification, Instance Selection and Feature Construction: Survey and Extensions to Time-Series - Feature Selection for Data and Pattern Recognition - page 241

Table 11.3 (continued)

Data set                Size    |C|  S_N1(x)  BN_1 (%)  max N_1(x)  S_N10(x)  BN_10 (%)  max N_10(x)  RImb   P(c_M) (%)
Lighting2               121     2    1.15     9.1       5           0.36      28.8       23           0.206  60.3
Lighting7               143     7    1.63     21.0      6           0.39      39.1       26           0.15   26.6
MALLAT                  2400    8    1.09     1.3       6           1.48      2.7        53           0.0    12.5
MedicalImages           1141    10   0.78     18.2      4           0.35      31.6       26           0.48   52.1
MoteStrain              1272    2    0.91     5.0       6           0.73      9.3        33           0.08   53.9
OliveOil                60      4    0.92     11.7      4           0.38      28.0       23           0.26   41.7
OSULeaf                 442     6    1.07     25.3      5           0.63      44.8       29           0.11   21.9
Plane                   210     7    1.03     0.0       5           −0.05     0.1        21           0.0    14.3
SonyAIBORobotSurface    621     2    1.32     1.9       7           0.80      4.4        33           0.12   56.2
SonyAIBORobotSurfaceII  980     2    1.22     1.5       6           0.82      6.5        35           0.23   61.6
SwedishLeaf             1125    15   1.43     14.7      8           0.97      23.5       41           0.0    6.7
Symbols                 1020    6    1.25     1.8       6           0.83      3.0        38           0.02   17.7
Synthetic-control       600     6    2.58     0.7       12          1.40      2.0        54           0.0    16.7
Trace                   200     4    1.36     0.0       6           0.04      2.5        22           0.0    25
TwoLeadECG              1162    2    1.22     0.1       6           0.40      0.3        33           0.0    50
Two-Patterns            5000    4    2.07     0.0       14          1.01      0.1        46           0.02   26.1
Average                 917.6   7.6  1.21     11.7      6.24        0.58      21.2       31.54        0.08   31.0

Each dataset is described both by a set of basic properties (size, number of classes) and some hubness-related quantities for two different neighborhood sizes, namely: the skewness of the k-occurrence distribution (S_Nk(x)), the percentage of bad k-occurrences (BN_k), the degree of the largest hub-point (max N_k(x)). Also, the relative imbalance of the label distribution (RImb) is given, as well as the size of the majority class, P(c_M) (expressed as a percentage of the entire dataset)
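The quantities named in the caption can be computed from a k-nearest-neighbor graph. The Python sketch below is illustrative only and is not the authors' implementation. The function names `hubness_stats` and `label_stats` are hypothetical. Euclidean distance is assumed for brevity, whereas time-series experiments would typically use an elastic distance such as DTW. RImb is assumed to be the normalized standard deviation of the class probabilities around the uniform value 1/|C|; for a hypothetical 73/48 split of 121 instances this assumption gives a majority share of 60.3% and an RImb of roughly 0.21, close to the Lighting2 row.

```python
import numpy as np


def hubness_stats(X, y, k):
    """Return (skewness of N_k, bad k-occurrence percentage, max N_k).

    Assumption: Euclidean distance; swap in another distance matrix
    (for example DTW) to mirror a time-series setting.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = len(X)
    # Pairwise squared Euclidean distances (same ordering as Euclidean).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    # Indices of the k nearest neighbors of every point.
    knn = np.argsort(dist, axis=1, kind="stable")[:, :k]
    # N_k(x): number of k-NN lists in which x occurs.
    n_k = np.bincount(knn.ravel(), minlength=n)
    # A k-occurrence is "bad" when neighbor and query labels differ.
    bad = y[knn] != y[:, None]
    bn_k = 100.0 * bad.sum() / (n * k)
    # Skewness: third standardized moment of the N_k distribution.
    mu, sigma = n_k.mean(), n_k.std()
    skew = float(((n_k - mu) ** 3).mean() / sigma**3) if sigma > 0 else 0.0
    return skew, float(bn_k), int(n_k.max())


def label_stats(y):
    """Return (RImb, majority class share in percent).

    Assumed definition of RImb: normalized standard deviation of the
    class probabilities from the uniform value 1/|C|.
    """
    _, counts = np.unique(np.asarray(y), return_counts=True)
    p = counts / counts.sum()
    c = len(p)
    if c < 2:
        return 0.0, 100.0
    rimb = float(np.sqrt(((p - 1.0 / c) ** 2).sum() / ((c - 1) / c)))
    return rimb, float(100.0 * p.max())


# Toy usage on four one-dimensional points (made-up data).
X_toy = [[0.0], [1.0], [2.5], [10.0]]
y_toy = [0, 0, 1, 1]
print(hubness_stats(X_toy, y_toy, k=1))
print(label_stats(y_toy))
```

The dense distance matrix keeps the sketch short but needs memory quadratic in the number of instances, so a tree-based or approximate neighbor search would be preferable for the larger datasets in the table.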
