. Vowel recognition data. The data collection process involves digital sampling
of speech with acoustic signal processing, followed by recognition of the
phonemes, groups of phonemes and words. The goal here is a speaker-independent
rule based on ten variables of eleven vowels that occur in various words spoken
(recorded and processed) by fifteen British male and female speakers.
Deterding ( ) collected this dataset of vowels, which can be found in the CMU
benchmark repository on the WWW. There are entries for training and
for testing. Three other types of classifiers were also applied to this dataset:
neural networks and k-NN by Robinson and Fallside ( ), and decision trees by
Shang and Breiman ( ). For the sake of variety, both versions of our classifier
were used and a somewhat different error test procedure was used. The results
are shown in Table . .
. A neural-pulse dataset. This has interesting and unusual features. There are two
classes of neurons, whose outputs to stimuli are to be distinguished. They consist
of different pulses measured in a monkey's brain (poor thing!). There are
samples with variables (the pulses). This dataset was given to me by a very
competent group (that of Prof. Coiffman, CS & Math. Depts. at Yale Univ.), who
had been working on it but had been unable to obtain a viable rule with the
classification methods they used. Remarkably, with NC convergence is obtained
based on only nine of the parameters. The resulting ordering shows a striking
separation. In Fig. . , the first pair of variables x , x is plotted as originally
given on the left. On the right, the best pair x , x , as chosen by the classifier's
ordering, speaks for itself. By the way, to discover this finding manually would
require the construction of a scatterplot matrix with pairs, and then careful
inspection and comparison of the individual plots. The implementation provides
all the next best sections to complete the rule's visualization. The dataset
consists of two “pretzel-like” clusters winding closely in -D, one (the complement
in this case) enclosing the other. Note that the classifier can actually describe
highly complex regions that carve the cavity shown. One can understand why the
separation of clusters by hyperplanes or nearest-neighbor techniques can fail
badly on such datasets. The rule has an error of %.
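To give a feel for the manual effort that a scatterplot matrix implies, the short sketch below counts and enumerates the unordered variable pairs for a given number of variables. The helper name scatterplot_pairs is hypothetical, and the value of ten is used only because that is the number of variables stated for the vowel data; the variable count of the neural-pulse data is not assumed here.

```python
from itertools import combinations
from math import comb


def scatterplot_pairs(n_variables):
    """Return every unordered pair (i, j) with i < j, using 1-based indices.

    Each pair corresponds to one distinct panel of a scatterplot matrix.
    """
    return list(combinations(range(1, n_variables + 1), 2))


# Illustrative value only: ten variables, as stated for the vowel data.
n = 10
pairs = scatterplot_pairs(n)

# The number of panels grows quadratically: n * (n - 1) / 2.
print(len(pairs), comb(n, 2))
print(pairs[:3])
```

Because the count grows quadratically with the number of variables, inspecting every panel by eye quickly becomes impractical, which is the point made above about discovering the best pair manually.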
Table . . Summary of classification results for the vowel dataset
Rank  Classifier               Testing mode      Test error rate %
      Nested Cavities (NC)     Cross-validation  .
      CART-DB                  Cross-validation  .
      Nested Cavities (NC)     Train & Test      .
      CART                     Cross-validation  .
      k-NN                     Train & Test      .
      RBF                      Train & Test      .
      Multilayer perceptron    Train & Test      .
      Single-layer perceptron  Train & Test      .
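The two testing modes named in the table can be contrasted with a minimal sketch. It is an illustration under stated assumptions, not the procedure used to produce the results above: a 1-nearest-neighbor rule stands in for the classifiers, the eight data points are made up, and all function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def predict_1nn(train_X, train_y, x):
    """Label of the training point closest to x (stand-in classifier)."""
    nearest = min(range(len(train_X)),
                  key=lambda i: squared_distance(train_X[i], x))
    return train_y[nearest]


def count_errors(train_X, train_y, test_X, test_y):
    """Number of test points whose predicted label differs from the true one."""
    return sum(predict_1nn(train_X, train_y, x) != y
               for x, y in zip(test_X, test_y))


def train_and_test_error(train_X, train_y, test_X, test_y):
    """Train & Test mode: one fixed split, error in percent of test points."""
    return 100.0 * count_errors(train_X, train_y, test_X, test_y) / len(test_X)


def cross_validation_error(X, y, k):
    """k-fold cross-validation: each point is tested exactly once."""
    n = len(X)
    wrong = 0
    for fold in range(k):
        train_idx = [i for i in range(n) if i % k != fold]
        test_idx = [i for i in range(n) if i % k == fold]
        wrong += count_errors([X[i] for i in train_idx],
                              [y[i] for i in train_idx],
                              [X[i] for i in test_idx],
                              [y[i] for i in test_idx])
    return 100.0 * wrong / n


# Made-up toy data: two well-separated clusters with labels 0 and 1.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Train & Test: even-indexed points for training, odd-indexed for testing.
holdout_error = train_and_test_error(X[0::2], y[0::2], X[1::2], y[1::2])
cv_error = cross_validation_error(X, y, k=4)
print(holdout_error, cv_error)
```

The practical difference is that Train & Test depends on one particular split, whereas cross-validation uses every sample for testing once, so error rates obtained under the two modes are not directly comparable.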