Example
In our spam example above, the class-independence assumption says that the occurrences of the different words in the email are independent of each other, given the class. More specifically, in spam emails, every word has a probability of occurring, but all words occur independently: if "a" occurs with 20% probability and "the" with 50% probability in spam emails, the probability that both words occur in a spam email is 20% times 50% = 10%; the only factor that influences the probability of a word occurring is whether the email is spam or not. Obviously this assumption will be violated in real emails. Nevertheless, many spam filters successfully use Naive Bayes classifiers even though they are based upon an unrealistic assumption.
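The joint-probability computation in the example above can be sketched as follows; the per-word probabilities are the illustrative figures from the text, not estimates from real data:

```python
# Class-conditional independence in Naive Bayes: given the class (spam),
# word occurrences are assumed independent, so the joint probability of
# several words occurring is the product of their individual probabilities.

# Illustrative per-word occurrence probabilities in spam emails (from the text).
p_word_given_spam = {"a": 0.20, "the": 0.50}

def joint_probability(words, p_given_class):
    """P(all words occur | class) under the independence assumption."""
    prob = 1.0
    for w in words:
        prob *= p_given_class[w]
    return prob

both = joint_probability(["a", "the"], p_word_given_spam)
print(both)  # 0.2 * 0.5 = 0.1
```

A real classifier would estimate these conditional probabilities from training data and combine them with the class prior via Bayes' rule; the product structure shown here is what the independence assumption buys.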
As discussed in much detail in Chapter 3, often there is a need to learn classifiers
that do not discriminate with respect to certain sensitive attributes, e.g., gender,
even though the labels in the training data themselves may represent a discrimina-
tory situation. In Chapters 12 and 13, preprocessing techniques and an adapted
decision tree learner for discrimination-aware classification have already been in-
troduced. In this chapter, we provide three methods to make a Naive Bayes model
discrimination-free:
1. Use different decision thresholds for every sensitive attribute value; e.g., females need a lower score than men to get the positive label.
2. Learn a different model for every sensitive attribute value and use different decision thresholds.
3. Add an attribute for the actual non-discriminatory class to a specialized Naive Bayes model and try to learn the actual class values of every row in the dataset using the expectation-maximization algorithm.
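The first method can be sketched in a few lines; the attribute values and threshold numbers below are hypothetical assumptions for illustration, not values given in the chapter:

```python
# Method 1 (sketch): group-specific decision thresholds applied to a
# classifier's positive-class score. To counter a bias against one group,
# that group's threshold is lowered, so its members need a lower score
# to receive the positive label.

# Hypothetical thresholds per sensitive attribute value (illustrative only).
thresholds = {"female": 0.40, "male": 0.55}

def predict(score, group, thresholds):
    """Assign the positive label iff the score reaches the group's threshold."""
    return score >= thresholds[group]

# The same score of 0.45 yields the positive label for one group only.
print(predict(0.45, "female", thresholds))  # True
print(predict(0.45, "male", thresholds))    # False
```

In practice the thresholds would be tuned on held-out data until the measured discrimination between the groups drops to the desired level.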
Note, however, that all of the above methods can be seen as a type of positive discrimination: they assume an equal treatment of every sensitive attribute value and force the predictor to satisfy this assumption, sacrificing predictive accuracy in the process. Thus, although the off-the-shelf classifier considers it more likely for some people to be assigned a positive class, they are forcibly assigned a negative class in order to reduce discrimination, i.e., they are discriminated positively.
Since positive discrimination is considered illegal in several countries, these methods should be applied with care. Applying predictive tools untouched, however, should also be done with care, since they are very likely to be discriminating: they exploit any correlation in order to improve accuracy, including the correlation between the sensitive and class attributes.
Since it is impossible to identify the true cause of being assigned a positive
class using data mining, discrimination in data mining cannot be avoided without
introducing positive discrimination. When applying data mining, one thus has to
make a choice between positive and negative discrimination. In our opinion, using
the assumption of equal treatment in a well-thought-out way is a lesser evil than
blindly applying a possibly discriminating data mining procedure.
This chapter is organized as follows. We start with an introduction to the Naive
Bayes classifier in Section 2. We then use examples to provide arguments in favor