node size to avoid tree nodes with low support, maximum confidence to avoid pure nodes, and minimum decrease in impurity to avoid node splits that yield only a minimal gain in predictive accuracy. Users can specify one or more of these stopping criteria, and the tree will grow until the first stopping criterion is met.
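The stopping criteria above can be sketched as a single check that is evaluated before each candidate split. This is an illustrative sketch, not the API of any particular tree library; the field names and thresholds are assumptions chosen for clarity.

```java
// Illustrative sketch of tree-growth stopping criteria; the class,
// field names, and threshold values are hypothetical, not from any
// specific data mining library.
public class StoppingCriteria {
    int minNodeSize;            // stop if a node holds too few records (low support)
    double maxConfidence;       // stop if the node is already (nearly) pure
    double minImpurityDecrease; // stop if the best split gains too little impurity reduction

    StoppingCriteria(int minNodeSize, double maxConfidence, double minImpurityDecrease) {
        this.minNodeSize = minNodeSize;
        this.maxConfidence = maxConfidence;
        this.minImpurityDecrease = minImpurityDecrease;
    }

    /** Returns true as soon as any one of the criteria says to stop splitting. */
    boolean shouldStop(int nodeSize, double nodeConfidence, double bestImpurityDecrease) {
        return nodeSize < minNodeSize
            || nodeConfidence >= maxConfidence
            || bestImpurityDecrease < minImpurityDecrease;
    }
}
```

Because the criteria are combined with a logical OR, the tree stops growing at a node as soon as the first criterion is met, matching the behavior described above.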
Pruning is the process of removing the less significant tree nodes, for example, those with insufficient support. There are two types of pruning: pre-pruning and post-pruning. Pre-pruning avoids insignificant node splits while building the tree by measuring the goodness of each split. Post-pruning removes the insignificant nodes after building a fully grown tree. Different measures, called tree homogeneity metrics, are used to define the goodness of a node split, such as Gini, entropy, mean absolute deviation, mean squared error, and misclassification ratio. Tree homogeneity metrics are also referred to as information gain measures. Refer to [Han/Kamber 2006] for more details about the tree homogeneity metrics.
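Three of the homogeneity metrics named above have simple closed forms over a node's class proportions; the following sketch shows them side by side. The class and method names are illustrative, but the formulas (Gini index, entropy, misclassification ratio) are the standard definitions.

```java
// Standard impurity (homogeneity) metrics over a node's class
// proportions; the class and method names are illustrative.
public class Impurity {
    /** Gini index: 1 - sum(p_i^2). Zero for a pure node. */
    static double gini(double[] classProportions) {
        double sumSq = 0.0;
        for (double p : classProportions) sumSq += p * p;
        return 1.0 - sumSq;
    }

    /** Entropy: -sum(p_i * log2(p_i)). Zero for a pure node. */
    static double entropy(double[] classProportions) {
        double h = 0.0;
        for (double p : classProportions)
            if (p > 0) h -= p * (Math.log(p) / Math.log(2));
        return h;
    }

    /** Misclassification ratio: 1 - max(p_i). Zero for a pure node. */
    static double misclassification(double[] classProportions) {
        double max = 0.0;
        for (double p : classProportions) max = Math.max(max, p);
        return 1.0 - max;
    }
}
```

A split's goodness is then the parent's impurity minus the size-weighted impurity of the children; a pure node scores zero under all three metrics, which is why growing past purity gains nothing.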
Naïve Bayes
The naïve Bayes algorithm is one of the fastest classification algorithms. It produces results comparable to, and often better than, those of other classification algorithms, and it works well with large volumes of data.
Overview
Naïve Bayes is based on Bayes' theorem [Han/Kamber 2006] and assumes that the predictor attributes are conditionally independent² [Wikipedia-CI 2006] of each other with respect to the target attribute. This assumption significantly reduces the number of computations required to predict a target value, and hence the naïve Bayes algorithm performs well with large volumes of data.
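Under the conditional independence assumption, scoring a target value reduces to multiplying its prior by one conditional probability per predictor attribute. The sketch below illustrates this; the probability values in the example are made-up numbers, not taken from the book's CUSTOMERS dataset.

```java
// Minimal naïve Bayes scoring sketch. The conditional independence
// assumption lets us multiply per-attribute conditionals instead of
// estimating a full joint distribution. All numbers below are
// illustrative, not from the book's dataset.
public class NaiveBayesScore {
    /** Unnormalized posterior: prior * product of per-attribute conditionals. */
    static double score(double prior, double[] conditionals) {
        double s = prior;
        for (double p : conditionals) s *= p;
        return s;
    }

    public static void main(String[] args) {
        // Hypothetical: P(buy=yes)=0.6 with P(age=young|yes)=0.3, P(savings=high|yes)=0.5
        double yes = score(0.6, new double[]{0.3, 0.5}); // 0.6 * 0.3 * 0.5 = 0.09
        // Hypothetical: P(buy=no)=0.4 with P(age=young|no)=0.7, P(savings=high|no)=0.2
        double no  = score(0.4, new double[]{0.7, 0.2}); // 0.4 * 0.7 * 0.2 = 0.056
        System.out.println(yes > no ? "yes" : "no");
    }
}
```

With n predictors and k target values, only n × k conditional probabilities need to be estimated and multiplied, which is what makes the algorithm fast on large data.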
The naïve Bayes algorithm involves computing the probability of each combination of target and predictor attribute values. To control the number of such combinations, attributes that have either continuous values or a high number of distinct values are typically binned. Refer to Section 3.2 for a more detailed discussion of binning. In this example, to simplify the description of the naïve Bayes algorithm, consider two attributes, age and savings balance, from the CUSTOMERS dataset (Table 7-3). These attributes are binned to have two binned
² Two events A and B are conditionally independent given a third event C precisely if the occurrence or non-occurrence of A and B are independent events in their conditional probability distribution given C. In other words, Pr(A ∩ B | C) = Pr(A | C) Pr(B | C).
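The identity in the footnote can be checked numerically on a small joint distribution constructed so that A and B are independent within each value of C. The distribution below is made up for illustration.

```java
// Numeric check of Pr(A ∩ B | C) = Pr(A | C) Pr(B | C) on a small
// made-up joint distribution in which A and B are conditionally
// independent given C. All probability values are illustrative.
public class CondIndep {
    /** Builds the joint Pr(A=a, B=b, C=c) and returns
     *  |Pr(A ∩ B | C=1) - Pr(A | C=1) Pr(B | C=1)|. */
    static double identityGap() {
        double[][][] p = new double[2][2][2];   // p[a][b][c]
        double[] pC = {0.5, 0.5};               // Pr(C=c)
        double[] pA = {0.6, 0.2};               // Pr(A=1 | C=c)
        double[] pB = {0.3, 0.7};               // Pr(B=1 | C=c)
        for (int c = 0; c < 2; c++)
            for (int a = 0; a < 2; a++)
                for (int b = 0; b < 2; b++)
                    p[a][b][c] = pC[c] * (a == 1 ? pA[c] : 1 - pA[c])
                                       * (b == 1 ? pB[c] : 1 - pB[c]);
        // Recover the conditionals from the joint by marginalizing.
        double pAB_C = p[1][1][1] / pC[1];
        double pA_C = (p[1][0][1] + p[1][1][1]) / pC[1];
        double pB_C = (p[0][1][1] + p[1][1][1]) / pC[1];
        return Math.abs(pAB_C - pA_C * pB_C);
    }

    public static void main(String[] args) {
        System.out.println(identityGap() < 1e-12 ? "identity holds" : "identity fails");
    }
}
```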