values to further simplify this discussion. For age, bin-1 contains values
less than or equal to 35 and bin-2 contains values greater than 35.
For savings balance, bin-1 contains values less than or equal to $20,000
and bin-2 contains values greater than $20,000. In JDM, a naïve Bayes
algorithm computes the probabilities of a target value for a given
attribute value using the cases in the build dataset. In this example,
we have two attributes, each with two binned values, and a binary target.
Listing 7-1 shows the eight probabilities that are
computed as part of the naïve Bayes model build. Using these probability
values, the naïve Bayes algorithm computes the most probable
target value for a given new case. In this example, for a new customer
whose age is 25 and savings balance is $13,300, the probability of being
an Attriter and of being a Non-attriter is computed as shown in Listing 7-2. Note
that in Listing 7-2, P(Attriter) and P(Non-attriter) are the prior probabilities of
the target values, which are specified as input to the model build. For
this new customer case, the probability of being a Non-attriter (0.31) is
greater than that of being an Attriter (0.03), and hence the model predicts this
customer to be a Non-attriter. For a more detailed discussion of naïve
Bayes and Bayesian classification, refer to [Han/Kamber 2006].
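The scoring step described above can be sketched in plain Java. This is only an illustration, not the JDM API: the class and method names are hypothetical, the per-attribute values are the two-decimal roundings of the count ratios in Listing 7-1, and the priors 0.2 (Attriter) and 0.8 (Non-attriter) are assumptions chosen because they are consistent with the 0.03 and 0.31 quoted in the text. Listing 7-2 remains the authoritative computation.

```java
class NaiveBayesScoreSketch {
    // Hypothetical helper: multiplies the prior by each per-attribute value.
    static double score(double prior, double... conditionals) {
        double s = prior;
        for (double c : conditionals) {
            s *= c;
        }
        return s;
    }

    // Picks the target value with the larger score.
    static String predict(double attriterScore, double nonAttriterScore) {
        return nonAttriterScore > attriterScore ? "Non-attriter" : "Attriter";
    }

    public static void main(String[] args) {
        // ASSUMED priors; they are not stated in this section.
        double priorAttriter = 0.2;
        double priorNonAttriter = 0.8;

        // New customer: age 25 (bin-1) and savings balance $13,300 (bin-1).
        // 0.33 = 2/6 and 0.43 = 3/7, 0.67 = 4/6 and 0.57 = 4/7, rounded to two places.
        double attriter = score(priorAttriter, 0.33, 0.43);
        double nonAttriter = score(priorNonAttriter, 0.67, 0.57);

        System.out.println("Attriter score:     " + attriter);
        System.out.println("Non-attriter score: " + nonAttriter);
        System.out.println("Prediction:         " + predict(attriter, nonAttriter));
    }
}
```

Under these assumptions the two scores round to 0.03 and 0.31, so the sketch predicts Non-attriter, in line with the text.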
Algorithm Settings
In JDM, a naïve Bayes algorithm has two settings, singleton threshold
and pairwise threshold, that are used to define which predictor
attribute values or predictor-target value pairs should be ignored.
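The text here does not spell out how the two thresholds are applied. One plausible reading, assumed purely for illustration, is that each threshold is a minimum fraction of build cases: an attribute value (singleton) or a predictor-target value pair (pairwise) that occurs less often than that fraction is ignored. The sketch below uses hypothetical names and illustrative threshold values, and does not call the JDM API.

```java
class ThresholdSketch {
    // ASSUMED rule, not the JDM definition: keep an item only if it occurs
    // in at least `threshold` (a fraction between 0 and 1) of the build cases.
    static boolean keep(int occurrences, int totalCases, double threshold) {
        return totalCases > 0 && (double) occurrences / totalCases >= threshold;
    }

    public static void main(String[] args) {
        int totalCases = 10;               // implied by the denominators in Listing 7-1
        double singletonThreshold = 0.05;  // illustrative value
        double pairwiseThreshold = 0.05;   // illustrative value

        // Singleton check: the value "SB > 20000" occurs in 3 of 10 cases.
        System.out.println(keep(3, totalCases, singletonThreshold)); // true

        // Pairwise check: the pair (SB > 20000, Non-attriter) occurs in 0 of 10 cases.
        System.out.println(keep(0, totalCases, pairwiseThreshold));  // false
    }
}
```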
Listing 7-1
Naïve Bayes algorithm computation of probabilities using build data
Probability of age ≤ 35 when the customer is Attriter
P( age ≤ 35 / Attriter ) = 2/6 = 0.33
Probability of age ≤ 35 when the customer is Non-attriter
P( age ≤ 35 / Non-attriter ) = 4/6 = 0.67
Probability of age > 35 when the customer is Attriter
P( age > 35 / Attriter ) = 3/4 = 0.75
Probability of age > 35 when the customer is Non-attriter
P( age > 35 / Non-attriter ) = 1/4 = 0.25
Probability of savings balance (SB) ≤ 20000 when the customer is Attriter
P( SB ≤ 20000 / Attriter ) = 3/7 = 0.43
Probability of savings balance (SB) ≤ 20000 when the customer is Non-attriter
P( SB ≤ 20000 / Non-attriter ) = 4/7 = 0.57
Probability of savings balance (SB) > 20000 when the customer is Attriter
P( SB > 20000 / Attriter ) = 3/3 = 1.00
Probability of savings balance (SB) > 20000 when the customer is Non-attriter
P( SB > 20000 / Non-attriter ) = 0/3 = 0.00
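The ratios in Listing 7-1 can be reproduced from per-bin counts with a few lines of Java. This is a sketch rather than the JDM model build: the class and method names are hypothetical, and the counts are read off the fractions in the listing because the build dataset itself is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class Listing71Sketch {
    // Fraction of the cases in one bin that carry a given target value.
    static double ratio(int casesWithTarget, int casesInBin) {
        if (casesInBin == 0) {
            throw new IllegalArgumentException("empty bin");
        }
        return (double) casesWithTarget / casesInBin;
    }

    public static void main(String[] args) {
        // {attriters, non-attriters} per bin, taken from the fractions in Listing 7-1.
        Map<String, int[]> counts = new LinkedHashMap<>();
        counts.put("age <= 35", new int[] {2, 4});
        counts.put("age > 35", new int[] {3, 1});
        counts.put("SB <= 20000", new int[] {3, 4});
        counts.put("SB > 20000", new int[] {3, 0});

        for (Map.Entry<String, int[]> e : counts.entrySet()) {
            int[] c = e.getValue();
            int binTotal = c[0] + c[1];
            System.out.println(e.getKey()
                + "  Attriter: " + ratio(c[0], binTotal)
                + "  Non-attriter: " + ratio(c[1], binTotal));
        }
    }
}
```

A zero count, as for the pair (SB > 20000, Non-attriter), yields a ratio of exactly 0, which forces any product that includes it to 0.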