9.3.3 The Lymphography Problem
The lymphography problem is a complex real-world problem with four different
classes (normal, metastases, malign lymph, and fibrosis) and 18 attributes,
three of which are numeric and the remaining 15 nominal (Table 9.7). The
original dataset contains a total of 148 instances, 104 of which were selected
at random for training, with the remaining 44 used for testing. The class
distribution in this dataset is highly unbalanced, with just two instances for
class “normal”, 81 for class “metastases”, 61 for “malign lymph”, and just
four for “fibrosis”.
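The figures above can be checked with a short sketch. It is only an illustration under stated assumptions: the class counts are taken from the text, while the helper names (class_proportions, random_split) and the seed are hypothetical and are not the procedure actually used to produce the original 104/44 split.

```python
import random

# Class counts as stated in the text (148 instances in total).
CLASS_COUNTS = {
    "normal": 2,
    "metastases": 81,
    "malign lymph": 61,
    "fibrosis": 4,
}


def class_proportions(counts):
    """Return the fraction of instances that belongs to each class."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def random_split(n_instances, n_train, seed=0):
    """Randomly split instance indices into training and testing sets.

    The seed is an assumption made here for reproducibility only.
    """
    rng = random.Random(seed)
    indices = list(range(n_instances))
    rng.shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


total = sum(CLASS_COUNTS.values())
train_idx, test_idx = random_split(total, 104)

for name, share in class_proportions(CLASS_COUNTS).items():
    print(f"{name:13s} {share:6.1%}")
print(len(train_idx), "training instances,", len(test_idx), "testing instances")
```

With only two "normal" and four "fibrosis" instances, a purely random split like this one may leave a rare class out of the testing set entirely, which is worth keeping in mind when reading per-class results.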
As shown in Table 9.7, for this problem, the attribute set will be repre-
sented by A = {A, …, R}. The terminal set will consist of T = {a, b, c, d},
representing, respectively, the four different classes: normal, metastases,
malign lymph, and fibrosis. For the numeric attributes, a set of 10 random
Table 9.7
Organization of the lymphography dataset.

Attribute        Symbol  Branches                                 Arity
LYMPHATICS       A       normal, arched, deformed, displaced      4
BLOCK_OF_AFFERE  B       no, yes                                  2
BL_OF_LYMPH_C    C       no, yes                                  2
BL_OF_LYMPH_S    D       no, yes                                  2
BY_PASS          E       no, yes                                  2
EXTRAVASATES     F       no, yes                                  2
REGENERATION_OF  G       no, yes                                  2
EARLY_UPTAKE_IN  H       no, yes                                  2
LYM_NODES_DIMIN  I       numeric (integer)                        2
LYM_NODES_ENLAR  J       numeric (integer)                        2
CHANGES_IN_LYM   K       bean, oval, round                        3
DEFECT_IN_NODE   L       no, lacunar, lac_margin, lac_central     4
CHANGES_IN_NODE  M       no, lacunar, lac_margin, lac_central     4
CHANGES_IN_STRU  N       no, grainy, drop_like, coarse, diluted,  8
                         reticular, stripped, faint
SPECIAL_FORMS    O       no, chalices, vesicles                   3
DISLOCATION_OF   P       no, yes                                  2
EXCLUSION_OF_NO  Q       no, yes                                  2
NO_OF_NODES_IN   R       numeric (integer)                        2
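
As an illustration of how the organization in Table 9.7 could be held in code, the sketch below builds a lookup from attribute symbol to its name, branches, and arity, together with the terminal set T = {a, b, c, d}. The names Attribute, ATTRIBUTES, and TERMINALS are hypothetical choices made for this example. Numeric attributes are given the arity of 2 listed in the table, with no branch labels.

```python
from collections import namedtuple
from string import ascii_uppercase

# Hypothetical record type for one row of Table 9.7.
Attribute = namedtuple("Attribute", "symbol name branches arity")

# None marks a numeric (integer) attribute; otherwise the nominal branches.
_ROWS = [
    ("LYMPHATICS", ["normal", "arched", "deformed", "displaced"]),
    ("BLOCK_OF_AFFERE", ["no", "yes"]),
    ("BL_OF_LYMPH_C", ["no", "yes"]),
    ("BL_OF_LYMPH_S", ["no", "yes"]),
    ("BY_PASS", ["no", "yes"]),
    ("EXTRAVASATES", ["no", "yes"]),
    ("REGENERATION_OF", ["no", "yes"]),
    ("EARLY_UPTAKE_IN", ["no", "yes"]),
    ("LYM_NODES_DIMIN", None),
    ("LYM_NODES_ENLAR", None),
    ("CHANGES_IN_LYM", ["bean", "oval", "round"]),
    ("DEFECT_IN_NODE", ["no", "lacunar", "lac_margin", "lac_central"]),
    ("CHANGES_IN_NODE", ["no", "lacunar", "lac_margin", "lac_central"]),
    ("CHANGES_IN_STRU", ["no", "grainy", "drop_like", "coarse", "diluted",
                         "reticular", "stripped", "faint"]),
    ("SPECIAL_FORMS", ["no", "chalices", "vesicles"]),
    ("DISLOCATION_OF", ["no", "yes"]),
    ("EXCLUSION_OF_NO", ["no", "yes"]),
    ("NO_OF_NODES_IN", None),
]

# Attribute set {A, ..., R}: one uppercase symbol per row, in table order.
ATTRIBUTES = {}
for symbol, (name, branches) in zip(ascii_uppercase, _ROWS):
    # Numeric attributes carry arity 2, as listed in Table 9.7.
    arity = 2 if branches is None else len(branches)
    ATTRIBUTES[symbol] = Attribute(symbol, name, branches, arity)

# Terminal set T = {a, b, c, d}, one symbol per class.
TERMINALS = {
    "a": "normal",
    "b": "metastases",
    "c": "malign lymph",
    "d": "fibrosis",
}

numeric_symbols = [s for s, a in ATTRIBUTES.items() if a.branches is None]
print("numeric attributes:", numeric_symbols)
print("arity of N:", ATTRIBUTES["N"].arity)
```

Deriving the arity of nominal attributes from the branch list, rather than typing it in, keeps the two columns of the table consistent by construction.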