Local preferential sampling applies the same principles as preferential sampling (Kamiran and Calders, 2010), but now locally, to partitions of the data. Within each partition it modifies the numbers of accepted male and female instances so that no redlining occurs. The procedure for local preferential sampling is presented in Figure 8.7.
Fig. 8.7 Local preferential sampling
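The sketch below illustrates one way the local procedure can be realized, assuming the data is held in a pandas DataFrame with a binary sensitive attribute, a binary class label, an explanatory attribute, and a precomputed ranking score (e.g. the positive-class probability of a ranker trained on the data). The function and column names are illustrative, not the authors' implementation, and for brevity only borderline positive instances are duplicated or removed, so the target acceptance rates are reached only approximately.

import pandas as pd


def local_preferential_sampling(df, sens, label, expl, score, pos):
    """Resample each partition (one value of the explanatory attribute) so that
    every sensitive group approaches the partition's target acceptance rate."""
    parts = []
    for _, part in df.groupby(expl):
        # Target acceptance rate: the average of the groups' rates within the
        # partition, so only the within-partition (illegal) gap is removed.
        target = part.groupby(sens)[label].apply(lambda y: (y == pos).mean()).mean()
        for _, g in part.groupby(sens):
            g = g.sort_values(score, ascending=False)   # confident positives first
            have = int((g[label] == pos).sum())
            want = int(round(target * len(g)))          # approximate target count
            if have < want:
                # Promote the group: duplicate its borderline (lowest-scored) positives.
                g = pd.concat([g, g[g[label] == pos].tail(want - have)])
            elif have > want:
                # Demote the group: drop its borderline positives.
                g = g.drop(index=g[g[label] == pos].tail(have - want).index)
            parts.append(g)
    return pd.concat(parts, ignore_index=True)

For the Adult data discussed below, a call with sens='sex', label='income' and expl='relationship' would correspond to treating the relationship attribute as explanatory.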
8.4.2 Computational Experiments
In this section we demonstrate the performance of the local discrimination-handling techniques on real-world datasets. The objective is to minimize the absolute value of the illegal discrimination while keeping the accuracy as high as possible. It is important not to overshoot and end up with reverse discrimination.
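As a point of reference, the sketch below shows how the quantities tracked in these experiments can be computed under the decomposition D_all = D_expl + D_illegal used for conditional discrimination: the explainable part is the gap that would remain if, within every value of the explanatory attribute, both genders were accepted at the partition's averaged rate. The column names and the exact averaging are assumptions of this sketch rather than necessarily the authors' formulation, and every partition is assumed to contain both groups.

import pandas as pd


def discrimination_measures(df, sens, label, expl, favored, protected, pos):
    """Return (D_all, D_expl, D_illegal) for a binary sensitive attribute."""
    def rate(group):
        # Acceptance rate: fraction of the group with the positive label.
        return (group[label] == pos).mean()

    fav, prot = df[df[sens] == favored], df[df[sens] == protected]
    d_all = rate(fav) - rate(prot)

    d_expl = 0.0
    for _, part in df.groupby(expl):
        p_fav, p_prot = part[part[sens] == favored], part[part[sens] == protected]
        # P*(+ | e_i): average of the two groups' acceptance rates in this
        # partition (assumes both groups occur in every partition).
        p_star = (rate(p_fav) + rate(p_prot)) / 2.0
        # Weight by how differently the two groups are spread over e_i.
        d_expl += (len(p_fav) / len(fav) - len(p_prot) / len(prot)) * p_star

    return d_all, d_expl, d_all - d_expl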
Data
For our experiments we use two real datasets. The Adult dataset comes from UCI (Asuncion and Newman, 2007); the task is to classify individuals into high- and low-income classes. Our dataset consists of a uniform sample of 15 696 instances, described by 13 attributes and a class label. Originally 6 of the 13 attributes were numeric; we discretized them. Gender is the sensitive attribute and income is the label. We repeat our experiments several times, each time selecting one of the other attributes as explanatory. Figure 8.8 (left) shows the discrimination in the dataset; the horizontal axis denotes the index of the explanatory attribute.
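A minimal sketch of the preprocessing just described is given below, assuming the raw UCI 'adult.data' file. The column list, the equal-frequency binning into four bins, and the random seed are assumptions, not the authors' exact setup.

import pandas as pd

COLUMNS = ["age", "workclass", "fnlwgt", "education", "education-num",
           "marital-status", "occupation", "relationship", "race", "sex",
           "capital-gain", "capital-loss", "hours-per-week",
           "native-country", "income"]

# Load the raw UCI file and discard incomplete records. The raw file has 14
# predictor columns while the text works with 13; which one was dropped is not
# stated, so all are kept here.
adult = pd.read_csv("adult.data", names=COLUMNS, skipinitialspace=True,
                    na_values="?").dropna()

# Discretize the numeric attributes; equal-frequency binning into 4 bins is an
# assumption, not the authors' exact discretization.
for col in adult.select_dtypes("number").columns:
    adult[col] = pd.qcut(adult[col], q=4, duplicates="drop").astype(str)

# Uniform sample of the size reported in the text. Gender ("sex") is the
# sensitive attribute and income is the class label.
sample = adult.sample(n=15_696, random_state=0)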
In the Adult dataset a number of attributes are only weakly related to gender (such as workclass, education, occupation, race, capital loss, native country). Therefore, nominating any of those attributes as explanatory will not explain much of the discrimination. For instance, knowledge of biology suggests that race and gender are independent. Thus, race cannot explain the discrimination on gender; that discrimination is either illegal or it is due to some other attributes. Indeed, the plot shows that all the discrimination is illegal when race (attribute #7) is treated as explanatory. On the other hand, we observe that the relationship attribute (attribute #6) explains a great deal of D_all. Judging subjectively, the values 'wife' and 'husband' of this attribute clearly capture gender information, and from the data mining perspective, if we are allowed to treat this attribute as acceptable, a large part of the discrimination is explained. Age and working hours per week are other examples of explanatory attributes that