Example
In our spam example above, the class-independence assumption says that the occurrences of the different words in the email are independent of each other, given the class. More specifically, in spam emails, every word has a probability of occurring, but all words occur independently: if "a" occurs with 20% probability and "the" with 50% probability in spam emails, the probability that both words occur in a spam email is 20% times 50% = 10%; the only factor that influences the probability of a word occurring is whether the email is spam or not. Obviously this assumption will be violated in real emails. Nevertheless, many spam filters successfully use Naive Bayes classifiers even though they are based upon an unrealistic assumption.
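The joint-probability computation in the example above can be sketched as follows; the per-word probabilities are the illustrative figures from the text, not estimates from real data:

```python
# Class-conditional independence in Naive Bayes: given the class (spam),
# word occurrences are assumed independent, so the joint probability of
# several words occurring is the product of their individual probabilities.

# Illustrative per-word occurrence probabilities in spam emails (from the text).
p_word_given_spam = {"a": 0.20, "the": 0.50}

def joint_probability(words, p_given_class):
    """P(all words occur | class) under the independence assumption."""
    prob = 1.0
    for w in words:
        prob *= p_given_class[w]
    return prob

both = joint_probability(["a", "the"], p_word_given_spam)
print(both)  # 0.2 * 0.5 = 0.1
```

A real classifier would estimate these conditional probabilities from training data and combine them with the class prior via Bayes' rule; the product structure shown here is what the independence assumption buys.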
As discussed in much detail in Chapter 3, often there is a need to learn classifiers
that do not discriminate with respect to certain sensitive attributes, e.g., gender,
even though the labels in the training data themselves may represent a discrimina-
tory situation. In Chapters 12 and 13, preprocessing techniques and an adapted
decision tree learner for discrimination-aware classification have already been in-
troduced. In this chapter, we provide three methods to make a Naive Bayes model
discrimination-free:
1. Use different decision thresholds for every sensitive attribute value; e.g., females need a lower score than men to get the positive label.
2. Learn a different model for every sensitive attribute value and use different decision thresholds.
3. Add an attribute for the actual non-discriminatory class to a specialized Naive Bayes model and try to learn the actual class values of every row in the dataset using the expectation-maximization algorithm.
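The first method can be sketched in a few lines; the attribute values and threshold numbers below are hypothetical assumptions for illustration, not values given in the chapter:

```python
# Method 1 (sketch): group-specific decision thresholds applied to a
# classifier's positive-class score. To counter a bias against one group,
# that group's threshold is lowered, so its members need a lower score
# to receive the positive label.

# Hypothetical thresholds per sensitive attribute value (illustrative only).
thresholds = {"female": 0.40, "male": 0.55}

def predict(score, group, thresholds):
    """Assign the positive label iff the score reaches the group's threshold."""
    return score >= thresholds[group]

# The same score of 0.45 yields the positive label for one group only.
print(predict(0.45, "female", thresholds))  # True
print(predict(0.45, "male", thresholds))    # False
```

In practice the thresholds would be tuned on held-out data until the measured discrimination between the groups drops to the desired level.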
Note, however, that all of the above methods can be seen as a type of positive discrimination: they assume an equal treatment of every sensitive attribute value and force the predictor to satisfy this assumption, sacrificing predictive accuracy in the process. Thus, although the off-the-shelf classifier considers it more likely for some people to be assigned a positive class, they are forcibly assigned a negative class in order to reduce discrimination, i.e., they are discriminated positively.
Since positive discrimination is considered illegal in several countries, these methods should be applied with care. Applying predictive tools untouched, however, should also be done with care, since they are very likely to be discriminating: they exploit any correlation in order to improve accuracy, including the correlation between the sensitive and class attributes.
Since it is impossible to identify the true cause of being assigned a positive
class using data mining, discrimination in data mining cannot be avoided without
introducing positive discrimination. When applying data mining, one thus has to
make a choice between positive and negative discrimination. In our opinion, using
the assumption of equal treatment in a well-thought-out way is a lesser evil than
blindly applying a possibly discriminating data mining procedure.
This chapter is organized as follows. We start with an introduction to the Naive
Bayes classifier in Section 2. We then use examples to provide arguments in favor