“male”). This setting represents the simplest possible situation and marks the starting point of the recent discrimination-aware research. For a discussion of more elaborate settings that build upon this base case but involve a more complex ecology of attributes, see Chapter 8 of this book.
First, we motivate the problem of discrimination-free classification by relating it to existing anti-discrimination laws that prohibit discrimination in housing, employment, financing, insurance, and wages on the basis of race, color, national origin, religion, sex, familial status, and disability (Section 12.2.1). For a more in-depth discussion of anti-discrimination and privacy legislation, we refer the interested reader to Chapter 4 of this book. Next, we give a measure for discrimination on which the problem of classification without discrimination will be based (Section 12.2.2). Then, we show how to learn accurate classifiers on discriminatory training data that do not discriminate in their future predictions (Section 12.3). In particular, we discuss three types of techniques that lead to discrimination-free classifiers. The three classes of techniques, and where in the classifier learning process they take place, are illustrated in Figure 12.1.
[Fig. 12.1 Graphical illustration of the three classes of discrimination-free techniques for classification. The pipeline runs from Input (training data) through Learning (inducing the classifier) to Output (predictive model). Techniques on the input data (Section 12.3.1): instance relabeling (massaging), reweighing and resampling, and rule hiding (see also Chapter 13). Techniques during learning (Section 12.3.2): discrimination-aware decision trees and EM for Bayesian networks (see also Chapter 14). Techniques on the output model (Section 12.3.3): leaf relabeling in decision trees and adjusting thresholds in naïve Bayes (see also Chapter 14).]
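As a rough sketch of the measure referred to in Section 12.2.2 (the formal definition is given there), discrimination in a labeled dataset or in a classifier's predictions is quantified as the difference in positive-label probability between the favored and the deprived group; in the running example,

    disc = P(successful | male) − P(successful | female),

and a dataset or classifier is considered discrimination-free when this difference is zero (or below a small tolerance).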
The first class of techniques removes the discrimination from the input data in one of three ways. The first is to selectively relabel some of the instances (we call this massaging); for instance, in the example above, some of the unsuccessful females could be relabeled as successful and some of the successful males as unsuccessful. The second is to resample the input data; that is, some of the successful males are removed from the input data and some of the successful females' records are duplicated. The third is to reweigh the data, that is, to assign higher weights to successful females and lower weights to successful males (Calders, Kamiran, & Pechenizkiy, 2009; Kamiran & Calders, 2009a). Another approach that belongs to this class is described in Chapter 13 of this book: based on a collection of discriminative rules detected by the discrimination discovery techniques described in Chapter 5 of this book, rule hiding techniques from privacy-preserving data mining (Chapter 11 of this book) are used to suppress the discriminative rules in the input data.
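To make the reweighing idea concrete, the following is a minimal sketch in Python of the standard reweighing scheme of Kamiran & Calders (2009a): every combination of sensitive value s and class label c receives the weight W(s, c) = P(s)P(c)/P(s, c), the ratio of the expected to the observed joint frequency, so that under the resulting weights the sensitive attribute and the class label are statistically independent. The toy data and variable names below are hypothetical and only illustrate the computation.

    # Minimal sketch (not the authors' code) of reweighing:
    # each (sensitive value, class label) pair gets weight
    #   W(s, c) = P(s) * P(c) / P(s, c),
    # so that the weighted data shows no dependency between the
    # sensitive attribute and the label.
    from collections import Counter

    def reweighing_weights(sensitive, labels):
        """Return one weight per instance: expected / observed joint frequency."""
        n = len(labels)
        count_s = Counter(sensitive)                # counts per sensitive value
        count_c = Counter(labels)                   # counts per class label
        count_sc = Counter(zip(sensitive, labels))  # joint counts
        return [count_s[s] * count_c[c] / (n * count_sc[(s, c)])
                for s, c in zip(sensitive, labels)]

    # Hypothetical toy data mirroring the text: males are "successful" more often.
    sex = ["m", "m", "m", "m", "f", "f", "f", "f"]
    successful = [1, 1, 1, 0, 1, 0, 0, 0]

    for s, c, w in zip(sex, successful, reweighing_weights(sex, successful)):
        print(s, c, round(w, 2))
    # Successful females get weight 2.0 and successful males 0.67, so a
    # weight-aware learner no longer sees sex and success as correlated.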