Table 6.27 Average error rates (with standard deviations) for the experiments with single neural networks (SNN); n_h denotes the number of hidden units.

Dataset          SNN error rate   n_h
ArtificialF2     19.56 (3.95)     20
Breast Tissue    32.75 (3.26)     22
CTG              15.70 (0.60)     20
Diabetes         23.90 (1.69)     15
Olive             5.45 (0.62)     15
PB12              7.51 (0.37)      6
Sonar            21.90 (2.80)     14
results than single neural networks, when the X × T space possesses some divisive properties.
6.6 Decision Trees
Decision trees (also known as classification trees) are attractive for applications requiring the semantic interpretation afforded by nodal decision rules, for instance as diagnostic tools in the medical area. We will only be interested in binary decision trees based on univariate splits, by far the most popular type of decision tree. These classifiers have a hierarchical structure such that any data instance x travels down the tree according to the outcomes of a sequence of binary (dichotomous) decisions (does x belong to ω_k or to its complement ω̄_k?), each evaluated on the basis of a single variable. The descent stops at a tree leaf, where the respective class label is assigned to x.
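As a concrete illustration (not from the text), the following minimal Python sketch shows this top-down routing; the Node structure and the classify function are hypothetical names, and only numerical (threshold) tests are modeled.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    """One tree node; internal nodes carry a univariate test, leaves a class label."""
    feature: Optional[int] = None      # index of the single variable tested here
    threshold: Optional[float] = None  # numerical threshold of the binary test
    left: Optional["Node"] = None      # child for instances satisfying the test
    right: Optional["Node"] = None     # child for the remaining instances
    label: Optional[str] = None        # class label, set only at leaves

def classify(node: Node, x: list) -> str:
    """Route instance x down the tree until a leaf assigns its class label."""
    while node.label is None:                  # internal node: apply its binary test
        if x[node.feature] <= node.threshold:  # dichotomous decision on one variable
            node = node.left
        else:
            node = node.right
    return node.label                          # leaf reached: assign its class label

# Example: a depth-2 tree over two numerical features.
root = Node(feature=0, threshold=2.5,
            left=Node(feature=1, threshold=0.7,
                      left=Node(label="w1"), right=Node(label="w2")),
            right=Node(label="w1"))
print(classify(root, [1.0, 0.5]))  # -> "w1"
```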
Tree construction algorithms typically follow a greedy approach: find at each node the best univariate decision rule, according to some criterion. Denoting the j-th tree node by u_j, a univariate split represents a binary test z_j as {x_ij ≤ Δ_j ⇒ z_j(x_i) = ω_k; ω̄_k otherwise} for numerical inputs (i.e., real-valued inputs), or as {x_ij ∈ B_j ⇒ z_j(x_i) = ω_k; ω̄_k otherwise} for categorical inputs; Δ_j and B_j are, respectively, a numerical threshold and a set of categories. The search for the best univariate decision rule is, therefore, the search for the best triple {x_i, ω_k, Δ_j or B_j} over all possible combinations of features, classes, and feature values at u_j; equivalently, the search for the best data split, since the training dataset at u_j is split into two training datasets, one sent to the left child node, u_jl (collecting, say, all data instances satisfying the split rule), and the other sent to the right child node, u_jr (collecting the remaining instances).
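The text leaves the split criterion open ("according to some criterion"); the sketch below uses the Gini index, one common choice, as a stand-in, and searches thresholds exhaustively over the observed feature values. All names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (one common split criterion)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def best_univariate_split(X, y):
    """Exhaustive greedy search for the feature/threshold pair whose data split
    minimizes the size-weighted impurity of the two child datasets."""
    best = None  # (score, feature index, threshold)
    n = len(y)
    for i in range(len(X[0])):                       # every feature
        for delta in sorted({row[i] for row in X}):  # every observed value as threshold
            left  = [y[k] for k in range(n) if X[k][i] <= delta]
            right = [y[k] for k in range(n) if X[k][i] >  delta]
            if not left or not right:
                continue                             # degenerate split: skip
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, i, delta)
    return best

X = [[2.0, 1.0], [1.5, 3.0], [3.0, 0.5], [2.5, 2.5]]
y = ["w1", "w2", "w1", "w2"]
print(best_univariate_split(X, y))  # -> (0.0, 1, 1.0): the split x_2 <= 1.0 is pure
```

Note that the class ω_k assigned to each child (the third element of the triple) falls out of the search: an impurity criterion implicitly labels each child dataset with its majority class.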
During tree construction (tree growing), the splitting of u_j into its children nodes u_jl and u_jr goes on until some stopping rule is satisfied. Typically, node splitting stops when a node contains an insufficient number of instances. Literature
on binary decision trees is abundant (see e.g., [33, 52, 80, 194, 188, 147, 161]).
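Putting the two previous sketches together, a minimal recursive grower (again with hypothetical names, reusing Node, Counter, gini, and best_univariate_split from above) might look as follows; the stopping rule is a simple minimum-instance count, plus the unavoidable cases of a pure node or no usable split.

```python
def grow(X, y, min_samples=2):
    """Greedy top-down construction: split each node until a stopping rule fires,
    then create a leaf labelled with the node's majority class."""
    split = best_univariate_split(X, y) if len(y) >= min_samples else None
    if split is None or len(set(y)) == 1:
        # stopping rule: too few instances, node already pure, or no usable split
        return Node(label=Counter(y).most_common(1)[0][0])
    _, i, delta = split
    left  = [k for k in range(len(y)) if X[k][i] <= delta]
    right = [k for k in range(len(y)) if X[k][i] >  delta]
    return Node(feature=i, threshold=delta,
                left=grow([X[k] for k in left],  [y[k] for k in left],  min_samples),
                right=grow([X[k] for k in right], [y[k] for k in right], min_samples))

# grow(X, y) on the toy data above yields a one-split tree; classify() then
# routes new instances through it.
```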