If the dataset D were unbiased, that is, if S and Class were statistically independent, the expected probability of being female and having the positive class, $P_{exp}(f \wedge +)$, would be:

$$P_{exp}(f \wedge +) := \frac{|X(S) = f|}{|D|} \times \frac{|X(Class) = +|}{|D|}.$$
For instance, in the example dataset of Table 12.1, 50% of people are female, and 60% of people have a positive class. Therefore, if the dataset were non-discriminatory, one would expect also 60% of females to have the positive class, which gives in total 50% × 60% = 30% of people being female and having the positive class. In reality, however, the observed probability in D,
$$P_{obs}(f \wedge +) := \frac{|X(S) = f \wedge X(Class) = +|}{|D|},$$
might be different. If the expected probability is higher than the observed probability, this shows a bias towards class '−' for those objects X with X(S) = f.
Continuing the example, in the dataset of Table 12.1 we observe that only 2 people are female and have a positive class label, so the observed probability of being female and positive is 20%, considerably lower than the expected 30%, thus indicating discrimination.
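These probabilities are straightforward to verify numerically. The following Python sketch uses a hypothetical 10-person dataset constructed to match the statistics quoted above (5 females, 6 positive labels, 2 female-positive objects); the actual rows of Table 12.1 may differ.

```python
# Hypothetical dataset consistent with the quoted statistics; each
# object is an (S, Class) pair. The real rows of Table 12.1 may differ.
D = [("f", "+")] * 2 + [("f", "-")] * 3 + [("m", "+")] * 4 + [("m", "-")]

n = len(D)
p_female = sum(1 for s, _ in D if s == "f") / n            # P(S = f)     = 0.5
p_positive = sum(1 for _, c in D if c == "+") / n          # P(Class = +) = 0.6
p_exp = p_female * p_positive                              # 0.5 * 0.6    = 0.3
p_obs = sum(1 for s, c in D if (s, c) == ("f", "+")) / n   # 2 / 10       = 0.2

print(p_exp, p_obs)  # 0.3 0.2 -> expected exceeds observed: discrimination
```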
To compensate for the bias, we assign weights to objects. If a particular group
is under-represented, we give members of this group a higher weight, making them
more important in the classifier training process. The weight we assign to an object
is exactly the expected probability divided by the observed probability. In the example this would mean that we assign a weight of 30% divided by 20% = 1.5 to
females with a positive class label. In this way we assign a weight to every object
according to its S- and Class-values. We call the dataset D with the added weights D_W. It can be proven that the resulting dataset D_W is unbiased; that is, if we multiply
the frequency of every object by its weight, the discrimination is 0. On this balanced
dataset the discrimination-free classifier is learnt.
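As an illustration of this reweighing step, the sketch below reuses the hypothetical dataset D from the previous snippet and computes the weight P_exp / P_obs for every (S, Class) combination (the function name reweigh is our own label, not from the text):

```python
def reweigh(D):
    """Attach to every object the weight P_exp(s, c) / P_obs(s, c) of its
    (S, Class) combination, producing the weighted dataset D_W."""
    n = len(D)
    weight = {}
    for s, c in set(D):                            # combinations present in D
        n_s = sum(1 for x, _ in D if x == s)       # |X(S) = s|
        n_c = sum(1 for _, y in D if y == c)       # |X(Class) = c|
        n_sc = sum(1 for p in D if p == (s, c))    # |X(S) = s and X(Class) = c|
        # (n_s/n * n_c/n) / (n_sc/n) simplifies to the expression below
        weight[(s, c)] = (n_s * n_c) / (n * n_sc)
    return [(obj, weight[obj]) for obj in D]

D_W = reweigh(D)
# female-positive objects get weight (5 * 6) / (10 * 2) = 1.5,
# matching the worked example in the text
```

Since only combinations that actually occur in D are iterated over, a division by zero cannot arise.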
Since not every classification algorithm can directly work with weights, we may
also use the weights when resampling the dataset; that is, we randomly select objects
from our training set to form a new dataset. When forming the new dataset, some
objects may be omitted and some may be duplicated. In the sampling procedure, the
weight of an object represents its relative chance of being chosen from the dataset; that is, in every selection step an object with a weight of 2.4 has a 4 times higher probability of being chosen than an object with a weight of 0.6. This variant is called resampling.
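Below is a minimal sketch of this resampling variant, again assuming the weighted dataset D_W from the previous snippet: objects are drawn with replacement, with a selection probability proportional to their weight.

```python
import random

def resample(D_W, k=None, seed=0):
    """Sample a new dataset of size k (default: |D|) with replacement,
    choosing each object with probability proportional to its weight."""
    objects = [obj for obj, _ in D_W]
    weights = [w for _, w in D_W]
    rng = random.Random(seed)
    return rng.choices(objects, weights=weights, k=k if k is not None else len(objects))

D_resampled = resample(D_W)  # some objects duplicated, others omitted
```

In expectation, each object then appears in the new dataset in proportion to its weight, so the female-positive objects (weight 1.5) are boosted exactly as the reweighing prescribes.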
Example 2. Consider again the dataset in Table 12.1. The weight for each data
object is computed according to its S- and Class-value, e.g. for instances with values
X(Sex) = f and X(Class) = +:

$$W(X) = \frac{0.5 \times 0.6}{0.2} = 1.5.$$