Direct and Indirect Discrimination Prevention Methods - Discrimination and Privacy in the Information Society

Based on this second dimension, discrimination prevention methods fall into three groups (Ruggieri et al. 2010): pre-processing, in-processing and post-processing approaches. We next describe these groups:

• Pre-processing. Methods in this group transform the source data so that the discriminatory biases contained in the original data are removed and no unfair decision rule can be mined from the transformed data; any standard data mining algorithm can then be applied. The pre-processing approaches of data transformation and hierarchy-based generalization can be adapted from the privacy preservation literature. Along this line, Kamiran and Calders (2009), Kamiran and Calders (2010), Hajian et al. (2011a and 2011b) and Hajian and Domingo-Ferrer (2012) perform a controlled distortion of the training data from which a classifier is learned, making minimally intrusive modifications that lead to an unbiased dataset.

• In-processing. Methods in this group change the data mining algorithms so that the resulting models do not contain unfair decision rules (Calders and Verwer 2010, Kamiran et al. 2010). For example, an alternative to cleaning the discrimination from the original dataset is proposed in Calders and Verwer (2010), whereby the non-discrimination constraint is embedded into a decision tree learner by changing its splitting criterion and its pruning strategy through a novel leaf re-labeling approach. In-processing discrimination prevention methods must, however, rely on new special-purpose data mining algorithms: standard data mining algorithms cannot be used as they are, because they have to be adapted to satisfy the non-discrimination requirement.

• Post-processing. These methods modify the resulting data mining models, instead of cleaning the original dataset or changing the data mining algorithms. For example, in Pedreschi et al. (2009a), a confidence-altering approach is proposed for classification rules inferred by the rule-based classifier CPAR (classification based on predictive association rules) (Yin et al. 2003).
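
To make the pre-processing idea concrete, the following minimal sketch measures discrimination as the gap in positive-decision rates between the unprotected and the protected group, and then changes as few class labels as possible to close that gap. It is only loosely modeled on label-changing ("massaging") techniques and is not the algorithm of any of the works cited above. The names Record, score and massage, the example data, and the availability of a ranker score for each record are all assumptions made for illustration. Both groups are assumed to be non-empty.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Record:
    protected: bool  # True if the individual belongs to the protected group
    score: float     # hypothetical ranker score: higher = closer to a positive decision
    label: int       # 1 = positive decision (e.g. benefit granted), 0 = negative


def positive_rate(records, protected):
    """Fraction of positive labels within one group (group assumed non-empty)."""
    group = [r for r in records if r.protected == protected]
    return sum(r.label for r in group) / len(group)


def discrimination(records):
    """Positive rate of the unprotected group minus that of the protected group."""
    return positive_rate(records, False) - positive_rate(records, True)


def massage(records):
    """Swap labels pairwise until the gap is no longer positive.

    Each step promotes the highest-scored negative record of the protected
    group and demotes the lowest-scored positive record of the unprotected
    group, so the overall number of positive labels stays the same.
    """
    data = list(records)
    while discrimination(data) > 0:
        promote = [i for i, r in enumerate(data) if r.protected and r.label == 0]
        demote = [i for i, r in enumerate(data) if not r.protected and r.label == 1]
        if not promote or not demote:
            break  # no candidates left to modify
        p = max(promote, key=lambda i: data[i].score)
        d = min(demote, key=lambda i: data[i].score)
        data[p] = replace(data[p], label=1)
        data[d] = replace(data[d], label=0)
    return data


# Illustrative toy data (made up): 4 protected and 4 unprotected individuals.
records = [
    Record(True, 0.9, 1), Record(True, 0.7, 0),
    Record(True, 0.4, 0), Record(True, 0.2, 0),
    Record(False, 0.8, 1), Record(False, 0.6, 1),
    Record(False, 0.3, 1), Record(False, 0.1, 0),
]
cleaned = massage(records)

print(discrimination(records))  # 0.5
print(discrimination(cleaned))  # 0.0
```

Because the transformation acts only on the data, any standard classifier can afterwards be trained on the cleaned records, which is the property that distinguishes pre-processing from the in-processing and post-processing groups.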

13.4 Types of Pre-processing Discrimination Prevention Methods

Although some methods have already been proposed for each of the above-mentioned approaches (pre-processing, in-processing, post-processing), discrimination prevention remains a largely unexplored research avenue. In this section, we concentrate on a group of discrimination prevention methods based on pre-processing (first dimension) that can deal with direct or indirect discrimination (second dimension), because pre-processing has the attractive feature of being independent of the data mining algorithms and models. More details, algorithms and experimental results on these methods are presented in Hajian et al. (2011a and 2011b) and Hajian and Domingo-Ferrer (2012). The purpose of all these methods is to transform the original data DB in such a way as to remove direct or indirect discriminatory biases, with minimum impact on the data and on legitimate decision
