Table 11.3 (continued)

Data set                   Size  |C|  S_N1(x)  BN_1 (%)  max N_1(x)  S_N10(x)  BN_10 (%)  max N_10(x)   RImb  P(c_M) (%)
Lighting2                   121    2     1.15       9.1           5      0.36       28.8           23  0.206        60.3
Lighting7                   143    7     1.63      21.0           6      0.39       39.1           26   0.15        26.6
MALLAT                     2400    8     1.09       1.3           6      1.48        2.7           53    0.0        12.5
MedicalImages              1141   10     0.78      18.2           4      0.35       31.6           26   0.48        52.1
MoteStrain                 1272    2     0.91       5.0           6      0.73        9.3           33   0.08        53.9
OliveOil                     60    4     0.92      11.7           4      0.38       28.0           23   0.26        41.7
OSULeaf                     442    6     1.07      25.3           5      0.63       44.8           29   0.11        21.9
Plane                       210    7     1.03       0.0           5      0.05        0.1           21    0.0        14.3
SonyAIBORobotSurface        621    2     1.32       1.9           7      0.80        4.4           33   0.12        56.2
SonyAIBORobotSurfaceII      980    2     1.22       1.5           6      0.82        6.5           35   0.23        61.6
SwedishLeaf                1125   15     1.43      14.7           8      0.97       23.5           41    0.0         6.7
Symbols                    1020    6     1.25       1.8           6      0.83        3.0           38   0.02        17.7
Synthetic-control           600    6     2.58       0.7          12      1.40        2.0           54    0.0        16.7
Trace                       200    4     1.36       0.0           6      0.04        2.5           22    0.0          25
TwoLeadECG                 1162    2     1.22       0.1           6      0.40        0.3           33    0.0          50
Two-Patterns               5000    4     2.07       0.0          14      1.01        0.1           46   0.02        26.1
Average                   917.6  7.6     1.21      11.7        6.24      0.58       21.2        31.54   0.08        31.0

Each dataset is described both by a set of basic properties (size, number of classes) and by some hubness-related quantities for two different neighborhood sizes, namely: the skewness of the k-occurrence distribution (S_Nk(x)), the percentage of bad k-occurrences (BN_k), and the degree of the largest hub-point (max N_k(x)). Also, the relative imbalance of the label distribution (RImb) is given, as well as the size of the majority class, P(c_M), expressed as a percentage of the entire dataset.
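The quantities named in the caption can be illustrated with a short sketch. It is not the procedure used to produce the table; it rests on several assumptions. The function names `relative_imbalance` and `hubness_stats` are hypothetical. Euclidean distance is used as a stand-in, since the caption does not say which distance produced the table. A "bad" k-occurrence is taken to be one where the neighbor's label differs from the label of the point it is a neighbor of. RImb is assumed to be the standard deviation of the class probabilities from the uniform value 1/|C|, normalized so that 0 means perfectly balanced; this assumption reproduces the two-class rows above (for Lighting2, a 60.3 % majority class gives 0.206).

```python
import numpy as np


def relative_imbalance(y):
    """RImb under the assumed definition: normalized deviation of the class
    probabilities from the uniform value 1/|C| (0 = perfectly balanced)."""
    _, counts = np.unique(y, return_counts=True)
    c = len(counts)
    if c < 2:
        return 0.0
    p = counts / counts.sum()
    return float(np.sqrt(((p - 1.0 / c) ** 2).sum() / ((c - 1.0) / c)))


def hubness_stats(X, y, k):
    """Hubness summary of a labeled dataset for neighborhood size k."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = len(X)

    # Pairwise Euclidean distances (assumption: stand-in metric).
    # Builds an n x n x d array, so this is only meant for small datasets.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor

    # Indices of the k nearest neighbors of every point, shape (n, k).
    knn = np.argsort(dist, axis=1, kind="stable")[:, :k]

    # N_k(x): how often each point appears in the k-NN lists of other points.
    occ = np.bincount(knn.ravel(), minlength=n)

    # Skewness of the k-occurrence distribution (third standardized moment).
    mu, sigma = occ.mean(), occ.std()
    skew = float(((occ - mu) ** 3).mean() / sigma ** 3) if sigma > 0 else 0.0

    # A k-occurrence is counted as bad when neighbor and query labels differ.
    bad = y[knn] != y[:, None]

    _, counts = np.unique(y, return_counts=True)
    return {
        "skewness": skew,                                  # S_Nk(x)
        "bad_pct": float(100.0 * bad.mean()),              # BN_k (%)
        "max_occ": int(occ.max()),                         # max N_k(x)
        "rimb": relative_imbalance(y),                     # RImb
        "majority_pct": float(100.0 * counts.max() / n),   # P(c_M) (%)
    }
```

Calling `hubness_stats(X, y, k=1)` and `hubness_stats(X, y, k=10)` on the same data yields the two groups of columns in the table. For time series, the Euclidean distance matrix would need to be replaced by whichever distance is actually in use.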
 