12.4 Experiments
The techniques discussed in this chapter have been evaluated extensively.
For a detailed discussion of the experimental studies and results, we refer
the interested reader to (Kamiran et al., 2010b,a; Kamiran & Calders, 2012;
Kamiran, 2011). In this section we give an overview of the most important
empirical results for the Adult dataset. This dataset has 48 842 instances and
contains demographic information about people. The associated prediction task is
to determine whether a person makes over 50K per year or not; that is, income
class High or Low has to be predicted. The other attributes in the dataset include:
age, type of work, education, years of education, marital status, occupation,
type of relationship (husband, wife, not in family), sex, race, native country,
capital gain, capital loss and weekly working hours. We consider Sex as the
sensitive attribute. In our sample of the dataset, 16 192 citizens have Sex = f
and 32 650 have Sex = m. The discrimination with respect to Sex = m in the
historical data is 19.45%:
$$P(X(\mathit{Class}) = + \mid X(\mathit{Sex}) = m) - P(X(\mathit{Class}) = + \mid X(\mathit{Sex}) = f) = 19.45\%.$$
The goal is to learn a classifier that has minimal discrimination and maintains high
accuracy.
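The discrimination measure used above is the difference between the positive-class rates of the two groups. A minimal Python sketch of that computation follows; the function name `discrimination`, the label encodings and the toy records are illustrative assumptions, not the Adult data or the authors' code.

```python
def discrimination(labels, sex, positive="+", favored="m", deprived="f"):
    """Return P(Class = + | Sex = favored) - P(Class = + | Sex = deprived)."""

    def positive_rate(group):
        # Class labels of all instances that belong to the given group.
        members = [c for c, s in zip(labels, sex) if s == group]
        if not members:
            raise ValueError(f"no instances with Sex = {group!r}")
        return sum(1 for c in members if c == positive) / len(members)

    return positive_rate(favored) - positive_rate(deprived)


# Toy records, made up purely for illustration.
toy_labels = ["+", "-", "+", "+", "-", "-", "+", "-"]
toy_sex = ["m", "m", "m", "m", "f", "f", "f", "f"]

toy_disc = discrimination(toy_labels, toy_sex)
print(f"discrimination = {toy_disc:.2%}")  # prints "discrimination = 50.00%"
```

In the toy data, three of four instances with Sex = m and one of four with Sex = f are positive, so the measure is 0.75 - 0.25 = 0.50. A value of zero would mean that both groups receive the positive class at the same rate.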
Figure 12.3 shows the results of experiments in which we learn decision trees after
applying our proposed discrimination-aware preprocessing techniques to the training
data (label 'Preprocessing'), with discrimination-aware splitting criteria (label
'Learner-adaptation'), and with leaf relabeling (label 'Postprocessing'), together with
the Naïve Bayes model of Chapter 14 (label '3-NaiveBayes') and decision trees learnt
without any discrimination-aware technique (label 'Zero-treatment'). We observe in
Figure 12.3 that the discrimination-aware techniques discussed in this chapter reduce
the discrimination significantly while maintaining a high accuracy compared to the
ordinary methods. For instance, a traditional decision tree without using any
discrimination removal method classifies the future data objects with 16.65%
[Figure 12.3: plot with horizontal axis Discrimination (%), ticks from -2 to 18, and vertical axis ticks from 80 to 87; legend: Zero-treatment, Preprocessing, Postprocessing, Learner-adaptation, 3-NaiveBayes.]
Fig. 12.3 Comparison of techniques discussed in Section 12.3.1 (label Preprocessing),
Section 12.3.2 (label Learner-adaptation), Section 12.3.3 (label Postprocessing), the Naïve
Bayes model of Chapter 14 (label 3-NaiveBayes), and ordinary methods (label Zero-treatment)
over the Adult dataset.
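Each method in a comparison like Figure 12.3 is characterised by two numbers: the accuracy of its predictions and the discrimination of those predictions on test data. The sketch below shows one way such a pair could be computed for a single classifier. The function `evaluate`, the encodings and the toy predictions are hypothetical and do not reproduce the reported experiments.

```python
def evaluate(y_true, y_pred, sex, positive="+", favored="m", deprived="f"):
    """Return (accuracy, discrimination), both in percent, for one classifier."""
    if not (len(y_true) == len(y_pred) == len(sex)) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    # Fraction of instances whose predicted class matches the true class.
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

    def predicted_positive_rate(group):
        preds = [p for p, s in zip(y_pred, sex) if s == group]
        if not preds:
            raise ValueError(f"no instances with Sex = {group!r}")
        return sum(p == positive for p in preds) / len(preds)

    # Discrimination is measured on the predictions, not on the true labels.
    disc = predicted_positive_rate(favored) - predicted_positive_rate(deprived)
    return 100 * accuracy, 100 * disc


# Made-up test set and predictions of two hypothetical classifiers.
sex = ["m", "m", "m", "m", "f", "f", "f", "f"]
y_true = ["+", "+", "-", "-", "+", "-", "-", "-"]
pred_plain = ["+", "+", "+", "-", "+", "-", "-", "-"]
pred_aware = ["+", "+", "-", "-", "+", "+", "-", "-"]

for name, pred in [("plain", pred_plain), ("aware", pred_aware)]:
    acc, disc = evaluate(y_true, pred, sex)
    print(f"{name}: accuracy = {acc:.1f}%, discrimination = {disc:.1f}%")
```

Reporting both numbers together makes the trade-off explicit: a method is preferable when it lowers discrimination without giving up much accuracy.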
 