secondary. Prioritization of these criteria agrees with the assumption that
the exploitation phase is longer than the exploration phase.
Assuming that we use a decision tree as the classifier, we are able to
estimate the probability $p_i$ by locating the appropriate leaf $k$ in the tree
that refers to the current instance $x_i$. The frequency vector of each leaf node
captures the number of instances from each possible class. In the usual case
of target marketing, the frequency vector has the form $(m_{k,\text{accept}}, m_{k,\text{reject}})$,
where $m_{k,c}$ denotes the number of instances in the labeled pool that reach
leaf $k$ and satisfy $y = c$. According to Laplace's law of succession, the
probability $p_i$ is estimated as:

$$p_i = p(m_{k,\text{accept}}, m_{k,\text{reject}}) = \frac{m_{k,\text{accept}} + 1}{m_{k,\text{accept}} + m_{k,\text{reject}} + 2}. \qquad (12.10)$$
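The Laplace correction in Eq. (12.10) can be sketched in a few lines; the function and parameter names below are illustrative, not part of the original method:

```python
def laplace_probability(m_accept: int, m_reject: int) -> float:
    """Estimate p_i from a leaf's frequency vector using Laplace's
    law of succession: (m_accept + 1) / (m_accept + m_reject + 2)."""
    return (m_accept + 1) / (m_accept + m_reject + 2)

# An empty leaf yields the uninformative prior 0.5, while the
# observed counts dominate as the leaf accumulates customers.
print(laplace_probability(0, 0))   # 0.5
print(laplace_probability(8, 2))   # (8 + 1) / (8 + 2 + 2) = 0.75
```

Note that the correction keeps the estimate strictly between 0 and 1, so leaves with few customers are never assigned a degenerate probability.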
Besides estimating the point probability $p_i$, we are interested in
estimating a confidence interval for this probability. An approach to a
customer can be considered as a Bernoulli trial. For the sake of simplicity,
we approximate the confidence interval of the Bernoulli parameter with the
normal approximation to the binomial distribution:
$$p_i - z_{1-\alpha/2}\,\sigma_i < p_i < p_i + z_{1-\alpha/2}\,\sigma_i$$

$$\sigma_i = \sigma(m_{k,\text{accept}}, m_{k,\text{reject}}) = \sqrt{\frac{p_i(1 - p_i)}{m_{k,\text{accept}} + m_{k,\text{reject}}}} \qquad (12.11)$$
where $\sigma_i$ represents the estimated standard deviation and $z_{1-\alpha/2}$ denotes
the value in the standard normal distribution table corresponding to the
$1 - \alpha/2$ percentile. For a small $n$ we can use the actual binomial distribution
to estimate the interval.
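A minimal sketch of the normal-approximation interval of Eq. (12.11), using the Laplace estimate of Eq. (12.10) as the point probability; the function name and the example counts are assumptions for illustration:

```python
import math
from statistics import NormalDist

def confidence_interval(m_accept: int, m_reject: int, alpha: float = 0.05):
    """Normal approximation to the binomial confidence interval for the
    leaf probability p_i, per Eqs. (12.10) and (12.11)."""
    n = m_accept + m_reject
    p = (m_accept + 1) / (n + 2)               # Eq. (12.10)
    sigma = math.sqrt(p * (1 - p) / n)         # Eq. (12.11)
    z = NormalDist().inv_cdf(1 - alpha / 2)    # z_{1-alpha/2}, about 1.96 for alpha = 0.05
    return p - z * sigma, p + z * sigma

# Two leaves with the same 4:1 accept/reject proportion but different sizes:
# the smaller leaf gets a similar point estimate with a wider interval.
lo_a, hi_a = confidence_interval(40, 10)   # 50 customers
lo_b, hi_b = confidence_interval(4, 1)     # 5 customers
print(hi_a - lo_a < hi_b - lo_b)           # True
```

For a small $n$, the exact binomial distribution (e.g. a Clopper–Pearson interval) would replace the normal approximation, as noted above.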
To demonstrate the importance of a confidence level, consider two
leaves, leaf $A$ and leaf $B$, in a classification tree. Each leaf holds the
customers in the labeled pool that fit its path. These customers are labeled
as either “accept” or “reject”. If the “accept”/“reject” proportions are the
same, then according to Eq. (12.10), both leaves have the same estimated
probability. Given this, if leaf $A$ has more customers than leaf $B$, then
according to Eq. (12.11), leaf $B$ has a larger confidence interval. Thus,
acquiring an instance for leaf $B$ will have a greater impact on the class
distribution than adding an example to leaf $A$. In the initial iterations,
when the data are limited and the confidence intervals are large, obtaining
an additional instance for the correct leaf is especially important. Moreover,
the potential contribution of labeling the $i$th instance in the same leaf and