Advanced Analytical Theory and Methods: Classification - Data Science and Big Data Analytics


Section 7.1 introduced a bank marketing dataset (Figure 7.3). This section shows how to use the naïve Bayes classifier on this dataset to predict whether clients will subscribe to a term deposit.

Building a naïve Bayes classifier requires knowing certain statistics, all calculated from the training set. The first requirement is to collect the probabilities of all class labels, $P(c_i)$. In the presented example, these would be the probability that a client will subscribe to the term deposit and the probability that the client will not. Both are estimated from the data available in the training set, as the fraction of training records that carry each class label.
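As a minimal sketch of this first step, the class label probabilities can be estimated as relative frequencies of the labels in a training set. The labels below are made-up toy values, not figures from the bank marketing dataset, and `class_priors` is a hypothetical helper name used only for illustration.

```python
from collections import Counter


def class_priors(labels):
    """Estimate P(c) for each class label c as its relative frequency."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Toy labels (illustrative only, not taken from the bank marketing dataset).
toy_labels = ["no", "no", "yes", "no", "no", "yes", "no", "no", "no", "no"]
priors = class_priors(toy_labels)
print(priors)
```

The estimated probabilities always sum to 1 across the class labels, because every training record carries exactly one label.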

The second thing the naïve Bayes classifier needs to know is the conditional probabilities of each attribute $a_j$ given each class label $c_i$, namely $P(a_j \mid c_i)$. The training set contains several attributes: job, marital, education, default, housing, loan, contact, and poutcome. For each attribute and its possible values, the conditional probabilities must be computed twice: once given that the client subscribes and once given that the client does not. For example, for the marital attribute, the conditional probability of each marital status is calculated for each of the two class labels.
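The conditional probabilities for a single categorical attribute can be sketched in the same counting style. This is an illustration under stated assumptions: the marital values and labels below are invented toy records, `conditional_probabilities` is a hypothetical helper, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def conditional_probabilities(values, labels):
    """Estimate P(a | c) for one categorical attribute.

    Returns a dict mapping each class label c to a dict {a: P(a | c)}.
    """
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[label][value] += 1
    return {
        label: {value: n / sum(counter.values()) for value, n in counter.items()}
        for label, counter in counts.items()
    }


# Toy records (illustrative only, not taken from the bank marketing dataset).
toy_marital = ["married", "single", "single", "married", "divorced", "married"]
toy_labels = ["no", "no", "yes", "no", "no", "yes"]
cond = conditional_probabilities(toy_marital, toy_labels)
print(cond)
```

Within each class label, the probabilities of the attribute's values sum to 1. The same helper would be called once per attribute (job, marital, education, and so on) to collect all the statistics the classifier needs.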

After training the classifier and computing all the required statistics, the naïve Bayes classifier can be tested over the testing set. For each record in the testing set, the naïve Bayes classifier assigns the class label $c_i$ that maximizes $P(c_i) \prod_{j} P(a_j \mid c_i)$, where the product runs over the attribute values $a_j$ of that record.
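The labeling rule can be sketched as follows: for each class label, multiply its probability by the conditional probabilities of the record's attribute values, then keep the label with the largest result. The sketch sums logarithms instead of multiplying raw probabilities. That is a common numerical convenience chosen here, not something the text prescribes, and it selects the same label because the logarithm is monotonic. All probabilities below are invented toy numbers, and `predict` is a hypothetical helper.

```python
import math


def predict(record, priors, cond_probs):
    """Return the class label c maximizing P(c) * prod_j P(a_j | c).

    Scores are summed in log space to avoid underflow. An attribute value
    never seen with a class gets probability 0 (no smoothing in this sketch).
    Priors are assumed to be strictly positive.
    """
    best_label, best_score = None, -math.inf
    for label, prior in priors.items():
        score = math.log(prior)
        for attribute, value in record.items():
            p = cond_probs[label][attribute].get(value, 0.0)
            score += math.log(p) if p > 0 else -math.inf
        if best_label is None or score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy statistics (illustrative only, not taken from the bank marketing dataset).
toy_priors = {"yes": 0.2, "no": 0.8}
toy_cond = {
    "yes": {
        "marital": {"single": 0.75, "married": 0.25},
        "housing": {"yes": 0.25, "no": 0.75},
    },
    "no": {
        "marital": {"single": 0.25, "married": 0.5, "divorced": 0.25},
        "housing": {"yes": 0.5, "no": 0.5},
    },
}

label_a = predict({"marital": "single", "housing": "no"}, toy_priors, toy_cond)
label_b = predict({"marital": "married", "housing": "yes"}, toy_priors, toy_cond)
print(label_a, label_b)
```

With these toy numbers, the first record scores 0.2 × 0.75 × 0.75 = 0.1125 for "yes" against 0.8 × 0.25 × 0.5 = 0.1 for "no", so it is labeled "yes". The second record scores 0.0125 for "yes" against 0.2 for "no", so it is labeled "no".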
