Based on this second dimension, discrimination prevention methods fall into
three groups (Ruggieri et al. 2010): pre-processing, in-processing and
post-processing approaches. We next describe these groups:
Pre-processing. Methods in this group transform the source data so that the
discriminatory biases contained in the original data are removed and no unfair
decision rule can be mined from the transformed data; any standard data mining
algorithm can then be applied. The pre-processing approaches of data
transformation and hierarchy-based generalization can be adapted from the
privacy preservation literature. Along this line, Kamiran and Calders (2009),
Kamiran and Calders (2010), Hajian et al. (2011a and 2011b) and Hajian and
Domingo-Ferrer (2012) perform a controlled distortion of the training data
from which a classifier is learned, making minimally intrusive modifications
that lead to an unbiased dataset.
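As a rough illustration of what a controlled distortion of the training data can look like, the sketch below swaps class labels between groups until the gap in positive-decision rates disappears. It is a deliberately simplified example and not the algorithm of any of the cited papers: the group names, the `discrimination` measure (a plain difference of positive rates) and the rule of picking the first eligible record are all assumptions made for this illustration.

```python
from typing import List, Tuple

# Hypothetical record layout: (group, label), where label 1 is the positive decision.
Record = Tuple[str, int]
PROTECTED, OTHER = "protected", "other"


def positive_rate(data: List[Record], group: str) -> float:
    """Fraction of positive labels in a group (assumes the group is non-empty)."""
    labels = [y for g, y in data if g == group]
    return sum(labels) / len(labels)


def discrimination(data: List[Record]) -> float:
    """Assumed bias measure: positive rate of OTHER minus positive rate of PROTECTED."""
    return positive_rate(data, OTHER) - positive_rate(data, PROTECTED)


def massage(data: List[Record]) -> List[Record]:
    """Swap labels pairwise (one promotion, one demotion) while this reduces the bias."""
    result = list(data)
    while discrimination(result) > 0:
        promote = next((k for k, (g, y) in enumerate(result)
                        if g == PROTECTED and y == 0), None)
        demote = next((k for k, (g, y) in enumerate(result)
                       if g == OTHER and y == 1), None)
        if promote is None or demote is None:
            break  # nothing left to swap
        trial = list(result)
        trial[promote] = (PROTECTED, 1)
        trial[demote] = (OTHER, 0)
        if abs(discrimination(trial)) >= abs(discrimination(result)):
            break  # a further swap would not bring the rates closer
        result = trial
    return result


records = [(PROTECTED, 1), (PROTECTED, 0), (PROTECTED, 0), (PROTECTED, 0),
           (OTHER, 1), (OTHER, 1), (OTHER, 1), (OTHER, 0)]
massaged = massage(records)
```

Swapping labels in pairs keeps the overall number of positive decisions unchanged, which is one simple way to keep the modification minimally intrusive; a more careful method would choose which records to change rather than taking the first eligible ones.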
In-processing. Methods in this group change the data mining algorithms so that
the resulting models do not contain unfair decision rules (Calders and Verwer
2010, Kamiran et al. 2010). For example, an alternative to cleaning the
discrimination from the original dataset is proposed in Calders and Verwer
(2010), whereby the non-discrimination constraint is embedded into a decision
tree learner by changing its splitting criterion and pruning strategy through
a novel leaf re-labeling approach. However, in-processing discrimination
prevention methods must rely on new special-purpose data mining algorithms;
standard data mining algorithms cannot be used as they are, because they must
first be adapted to satisfy the non-discrimination requirement.
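To make the idea of a changed splitting criterion concrete, the sketch below scores a candidate split by its information gain on the class minus its information gain on the sensitive attribute. The text above does not specify the criterion, so this particular formula, the function names and the toy attributes (`zip`, `degree`, `sex`, `hired`) are assumptions chosen for illustration only.

```python
import math
from collections import Counter
from typing import Dict, List

Row = Dict[str, str]


def entropy(values: List[str]) -> float:
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def info_gain(rows: List[Row], split_attr: str, target_attr: str) -> float:
    """Reduction in entropy of target_attr obtained by splitting on split_attr."""
    parts: Dict[str, List[str]] = {}
    for r in rows:
        parts.setdefault(r[split_attr], []).append(r[target_attr])
    remainder = sum(len(p) / len(rows) * entropy(p) for p in parts.values())
    return entropy([r[target_attr] for r in rows]) - remainder


def fair_split_score(rows: List[Row], split_attr: str,
                     class_attr: str, sensitive_attr: str) -> float:
    """Assumed criterion: reward gain on the class, penalize gain on the sensitive attribute."""
    return (info_gain(rows, split_attr, class_attr)
            - info_gain(rows, split_attr, sensitive_attr))


# Toy data: "zip" mirrors "sex" exactly, while "degree" is independent of it.
rows = [
    {"sex": "F", "zip": "A", "degree": "yes", "hired": "1"},
    {"sex": "F", "zip": "A", "degree": "yes", "hired": "0"},
    {"sex": "F", "zip": "A", "degree": "no", "hired": "0"},
    {"sex": "F", "zip": "A", "degree": "no", "hired": "0"},
    {"sex": "M", "zip": "B", "degree": "yes", "hired": "1"},
    {"sex": "M", "zip": "B", "degree": "yes", "hired": "1"},
    {"sex": "M", "zip": "B", "degree": "no", "hired": "1"},
    {"sex": "M", "zip": "B", "degree": "no", "hired": "0"},
]
best = max(["zip", "degree"],
           key=lambda a: fair_split_score(rows, a, "hired", "sex"))
```

On this toy data both attributes have the same plain information gain on `hired`, so a standard learner has no reason to prefer either; the adjusted score prefers `degree` because splitting on `zip` would also separate the records by `sex`.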
Post-processing. These methods modify the resulting data mining models,
instead of cleaning the original dataset or changing the data mining
algorithms. For example, in Pedreschi et al. (2009a), a confidence-altering
approach is proposed for classification rules inferred by the rule-based
classifier CPAR (Classification based on Predictive Association Rules) (Yin
et al. 2003).
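The following sketch shows what altering the confidence of an already mined rule might look like. The text above gives no details of the cited approach, so everything here is an assumption for illustration: the `Rule` type, the threshold `alpha`, and the choice to cap a rule's confidence at `alpha` times the confidence of the same rule without the protected item.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet


@dataclass(frozen=True)
class Rule:
    """Hypothetical classification rule: premise items imply an outcome."""
    premise: FrozenSet[str]
    outcome: str
    confidence: float


def cap_confidence(rule: Rule, base_confidence: float, alpha: float) -> Rule:
    """Lower the rule's confidence so that confidence / base_confidence <= alpha.

    base_confidence is the confidence of the same rule with the protected
    item removed from the premise (an assumed reference point).
    """
    limit = alpha * base_confidence
    if rule.confidence <= limit:
        return rule  # already within the allowed ratio, leave untouched
    return replace(rule, confidence=limit)


# The rule with the protected item ("sex=F") is far more confident than its base rule.
biased = Rule(frozenset({"city=X", "sex=F"}), "deny", 0.9)
adjusted = cap_confidence(biased, base_confidence=0.5, alpha=1.2)

# A rule that is already within the ratio is returned unchanged.
mild = Rule(frozenset({"city=Y", "sex=F"}), "deny", 0.55)
unchanged = cap_confidence(mild, base_confidence=0.5, alpha=1.2)
```

Only the model is touched here: neither the training data nor the rule learner changes, which is what distinguishes this group from the previous two.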
13.4 Types of Pre-processing Discrimination Prevention Methods
Although some methods have already been proposed for each of the
above-mentioned approaches (pre-processing, in-processing, post-processing),
discrimination prevention remains a largely unexplored research avenue. In
this section, we concentrate on a group of discrimination prevention methods
based on pre-processing (second dimension) that can deal with direct or
indirect discrimination (first dimension), because pre-processing has the
attractive feature of being independent of the data mining algorithms and
models. More details, algorithms and experimental results on these methods are
presented in Hajian et al. (2011a and 2011b) and Hajian and Domingo-Ferrer
(2012). The purpose of all these methods is to transform the original data DB
in such a way as to remove direct or indirect discriminatory biases, with
minimum impact on the data and on legitimate decision