Techniques for Discrimination-Free Predictive Models - Discrimination and Privacy in the Information Society - page 227

12.4 Experiments

The techniques discussed in this chapter have been evaluated extensively. For a detailed discussion of the experimental studies and results, we refer the interested reader to (Kamiran et al., 2010b,a; Kamiran & Calders, 2012; Kamiran, 2011). In this section we give an overview of the most important empirical results for the Adult dataset. This dataset has 48 842 instances and contains demographic information about people. The associated prediction task is to determine whether a person earns more than 50K per year; that is, the income class High or Low has to be predicted. The other attributes in the dataset include: age, type of work, education, years of education, marital status, occupation, type of relationship (husband, wife, not in family), sex, race, native country, capital gain, capital loss and weekly working hours. We consider Sex as the sensitive attribute. In our sample of the dataset, 16 192 citizens have Sex = f and 32 650 have Sex = m. The discrimination with respect to Sex = m in the historical data is 19.45%:

$$P(X(\mathit{Class}) = + \mid X(\mathit{Sex}) = m) - P(X(\mathit{Class}) = + \mid X(\mathit{Sex}) = f) = 19.45\%.$$

The goal is to learn a classifier that has minimal discrimination and maintains high accuracy.
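The discrimination measure above is a difference of two conditional probabilities, which can be estimated from a labeled sample as a difference of positive-class rates. The following is a minimal sketch under that reading, not the authors' code: the function name `discrimination`, its parameters `group_a` and `group_b`, and the toy data are all hypothetical and only serve to illustrate the arithmetic.

```python
from typing import Sequence


def discrimination(labels: Sequence[str], sex: Sequence[str],
                   positive: str = "+",
                   group_a: str = "m", group_b: str = "f") -> float:
    """Estimate P(Class = positive | Sex = group_a) - P(Class = positive | Sex = group_b).

    Hypothetical helper: the names and defaults are assumptions for illustration.
    """
    if len(labels) != len(sex):
        raise ValueError("labels and sex must have the same length")

    def positive_rate(group: str) -> float:
        # Class labels of the instances that belong to the given group.
        members = [c for c, s in zip(labels, sex) if s == group]
        if not members:
            raise ValueError(f"no instances with Sex = {group!r}")
        return sum(c == positive for c in members) / len(members)

    return positive_rate(group_a) - positive_rate(group_b)


# Toy sample, made up for illustration (not taken from the Adult dataset).
toy_class = ["+", "+", "-", "-", "+", "-", "-", "-"]
toy_sex = ["m", "m", "m", "m", "f", "f", "f", "f"]

# 2 of 4 instances with Sex = m are positive, against 1 of 4 with Sex = f.
toy_disc = discrimination(toy_class, toy_sex)
print(f"discrimination = {100 * toy_disc:.2f}%")
```

Applied to the class labels of the historical data, the same computation would give the discrimination in the data; applied to a classifier's predictions, it would give the discrimination of that classifier.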

Figure 12.3 shows the results of experiments in which we learn decision trees after applying our proposed discrimination-aware preprocessing techniques to the training data (label 'Preprocessing'), with discrimination-aware splitting criteria (label 'Learner-adaptation'), and with leaf relabeling (label 'Postprocessing'). It also shows a Naïve Bayes model from Chapter 14 of this book (label '3-NaiveBayes') and the result of learning without any discrimination-aware technique (label 'Zero-treatment'). We observe in Figure 12.3 that the discrimination-aware techniques discussed in this chapter reduce the discrimination significantly while maintaining a high accuracy as compared to the ordinary methods. For instance, a traditional decision tree without using any discrimination removal method classifies the future data objects with 16.65%

[Figure 12.3: plot with horizontal axis Discrimination (%) (ticks from -2 to 18) and vertical axis ticks from 80 to 87; legend: Zero-treatment, Preprocessing, Postprocessing, Learner-adaptation, 3-NaiveBayes.]

Fig. 12.3 Comparison of techniques discussed in Section 12.3.1 (label Preprocessing), Section 12.3.2 (label Learner-adaptation), Section 12.3.3 (label Postprocessing), Naïve Bayes model of Chapter 14 (label 3-NaiveBayes), and ordinary methods (label Zero-treatment) over the Adult dataset.
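The comparison in Figure 12.3 weighs two quantities per technique: the discrimination of the learned classifier and, going by the surrounding text, its accuracy. A minimal sketch of how one such pair could be computed from a classifier's predictions on held-out data is given below. The function name `accuracy_and_discrimination` and the toy predictions are hypothetical, and the sketch makes no claim about the evaluation protocol actually used in the cited studies.

```python
from typing import Sequence, Tuple


def accuracy_and_discrimination(y_true: Sequence[str], y_pred: Sequence[str],
                                sex: Sequence[str],
                                positive: str = "+") -> Tuple[float, float]:
    """Return (accuracy %, discrimination %) for one set of predictions.

    Hypothetical helper: discrimination is taken as the predicted positive
    rate for Sex = m minus the predicted positive rate for Sex = f.
    """
    if not (len(y_true) == len(y_pred) == len(sex)) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    # Fraction of instances whose predicted class matches the true class.
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

    def predicted_positive_rate(group: str) -> float:
        preds = [p for p, s in zip(y_pred, sex) if s == group]
        if not preds:
            raise ValueError(f"no instances with Sex = {group!r}")
        return sum(p == positive for p in preds) / len(preds)

    disc = predicted_positive_rate("m") - predicted_positive_rate("f")
    return 100 * accuracy, 100 * disc


# Toy predictions, made up for illustration (not results from the chapter).
y_true = ["+", "+", "-", "-", "+", "-", "-", "-"]
y_pred = ["+", "+", "+", "-", "+", "-", "-", "-"]
sex = ["m", "m", "m", "m", "f", "f", "f", "f"]

acc, disc = accuracy_and_discrimination(y_true, y_pred, sex)
print(f"accuracy = {acc:.2f}%, discrimination = {disc:.2f}%")
```

Because the measure is a signed difference, it can fall below zero when the classifier assigns the positive class more often to Sex = f than to Sex = m, which is consistent with the horizontal axis of the figure starting at -2.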
