Table 3.14 (continued)
Option: Alpha for merging
Setting: 0.05
Functionality/reasoning for selection: This option determines the significance level for the merging of predictor categories. Higher values (normally up to 0.10) hinder the merging of predictor categories, and a value of 1.0 totally prevents merging. Lower values (normally down to 0.01) facilitate the collapsing of predictor categories.

Option: Minimum records in parent branch/minimum records in child branch
Setting: Minimum 100 records for any child branch
Functionality/reasoning for selection: These options specify the minimum allowable number of records in the parent and child nodes of the tree. The tree growth stops if the size of the parent or of the resulting child nodes is less than the specified values. The size of the parent node should always be set higher (typically twice as high) than the corresponding size of the child nodes.
The requested values can be expressed as a percentage of the overall training data or as an absolute number of records.
Although the respective settings also depend on the total number of records available, it is generally recommended to keep the number of records for the terminal (child) nodes at least above 100 and, if possible, between 200 and 300. Large values of these settings provide robust rules "supported" by many cases/records, which we expect to perform well when applied to new datasets.
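
As a rough illustration of how the two settings above act on a candidate split, the following Python sketch may help. It is not the implementation of any particular tool: the function names (should_merge, min_records, split_allowed) and all counts are hypothetical, the merge test is simplified to a Pearson chi-square test on a 2x2 table (two predictor categories, binary target), and it assumes the usual CHAID convention that two categories are merged when the p-value of their difference exceeds the alpha for merging.

```python
import math


def chi2_2x2(table):
    """Pearson chi-square statistic for a 2x2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def should_merge(table, alpha_merge=0.05):
    """Assumed rule: merge two categories if they do NOT differ significantly."""
    return p_value_df1(chi2_2x2(table)) > alpha_merge


def min_records(value, total_records, as_percentage=False):
    """Express a minimum-size setting as an absolute number of records."""
    if as_percentage:
        return math.ceil(total_records * value / 100.0)
    return int(value)


def split_allowed(parent_size, child_sizes, min_parent, min_child):
    """Growth stops if the parent or any resulting child is below its minimum."""
    return parent_size >= min_parent and all(s >= min_child for s in child_sizes)


# Hypothetical counts: rows are two predictor categories,
# columns are (responders, non-responders).
example = [[110, 890], [135, 865]]

print(should_merge(example, alpha_merge=0.01))  # True: a low alpha facilitates merging
print(should_merge(example, alpha_merge=0.05))  # True
print(should_merge(example, alpha_merge=0.10))  # False: a higher alpha hinders merging
print(should_merge(example, alpha_merge=1.0))   # False: 1.0 prevents merging

# Child minimum of 100 records, parent minimum set twice as high.
print(split_allowed(450, [300, 150], min_parent=200, min_child=100))  # True
print(split_allowed(450, [380, 70], min_parent=200, min_child=100))   # False
```

In this sketch the same pair of categories is merged at alpha values of 0.01 and 0.05 but kept apart at 0.10, which matches the direction described in the table. The min_records helper shows how a percentage setting could be converted to a record count before the size check.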
SUMMARY
In this chapter we focused on the modeling techniques used in the context of
segmentation, and in particular on PCA and clustering techniques.
PCA is an unsupervised data reduction technique usually applied in order to
prepare data for clustering. It is used for effectively replacing a large set of original
continuous fields with a core set of composite measures.
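
To make the idea of a composite measure concrete, here is a minimal Python sketch that extracts the first principal component from three continuous fields. It uses power iteration on the correlation matrix, which is only one simple way to obtain the dominant component and is not necessarily how a given tool computes it. The function names and the six-record dataset are invented for illustration; the first two fields are deliberately made to move together.

```python
import math


def standardize(columns):
    """Scale each field to mean 0 and (population) standard deviation 1."""
    result = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        result.append([(x - mean) / sd for x in col])
    return result


def correlation_matrix(z):
    """Correlation matrix of already standardized fields."""
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def first_component(corr, iterations=200):
    """Power iteration: dominant eigenvalue and unit-length loadings."""
    k = len(corr)
    v = [1.0 / (i + 1) for i in range(k)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cv = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
    eigenvalue = sum(vi * cvi for vi, cvi in zip(v, cv))
    return eigenvalue, v


# Hypothetical data: three continuous fields for six records.
fields = [
    [10, 20, 30, 40, 50, 60],
    [12, 19, 33, 38, 52, 61],
    [5, 3, 6, 2, 4, 5],
]

z = standardize(fields)
eigenvalue, loadings = first_component(correlation_matrix(z))

# Eigenvalues of a correlation matrix sum to the number of fields,
# so this ratio is the share of total variance carried by the component.
share = eigenvalue / len(fields)

# Component score per record: loadings-weighted sum of standardized fields.
scores = [sum(l * zf[r] for l, zf in zip(loadings, z)) for r in range(len(z[0]))]

print(loadings, share)
```

Because the first two fields are almost perfectly correlated, a single component with roughly equal loadings on both can stand in for them, while the third field contributes very little to it.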