discrimination and 86.01% accuracy, even though the sensitive attribute was not used
at prediction time. We observe in our experiments that learning a decision tree
with a modified splitting criterion, that is, using the second type of discrimination-
aware classification alone, does not significantly reduce the discrimination. However,
when the decision trees are learnt on cleaner data obtained with the discrimination-
aware pre-processing techniques, the discrimination is reduced to 3.32% while
keeping the accuracy at 84.44%. In our experiments the decision trees with leaf
relabeling were able to reduce the discrimination to 0% while keeping a reasonably
high accuracy. Figure 12.3 also shows that our proposed methods outperform the
discrimination-aware Naïve Bayes model of Chapter 14 of this book with respect to
the accuracy-discrimination trade-off.
12.5 Discussion and Conclusion
In this chapter we discussed the idea of discrimination-aware classification and in-
troduced a procedural way to calculate the discrimination in a given dataset and in
the predictions of a classifier. We also discussed three types of techniques for learning
discrimination-free classifiers: data preprocessing techniques, an adapted classifier
learning procedure, and an approach for post-processing a learnt decision tree by
changing the labels of some of its leaves to make the final predictive model
discrimination-free. Finally, we presented empirical validation results showing that
the discrimination-aware classification methods predict labels for previously unseen
data objects with no or significantly lower discrimination and with minimal loss of
accuracy.
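For concreteness, the sketch below shows one way such a discrimination measure can be computed as the difference in positive-outcome rates between the two groups. It assumes a binary sensitive attribute coded as 0 for the favored group and 1 for the deprived group, and labels coded as 1 for the positive outcome; the function name and this encoding are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

def discrimination(y, s):
    """Difference between the positive-outcome rate of the favored group
    (s == 0) and that of the deprived group (s == 1).

    y : array-like of 0/1 labels (dataset labels or classifier predictions)
    s : array-like of 0/1 values of the sensitive attribute
    """
    y, s = np.asarray(y), np.asarray(s)
    return y[s == 0].mean() - y[s == 1].mean()

# The same function applies to a dataset (pass its actual labels) and to a
# classifier (pass its predictions on held-out data), e.g.:
#   disc_data  = discrimination(y_train, s_train)
#   disc_model = discrimination(clf.predict(X_test), s_test)   # clf is hypothetical
```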
Depending on the situation, one of the proposed techniques may be better suited than
another. First of all, if none of the other attributes is correlated with the sensitive attribute,
it clearly suffices to just remove this attribute. Unfortunately, this is seldom the
case, and even when it is, no guarantees can be given that no such correlations
exist. The presented preprocessing techniques have the advantage that they make the in-
put data discrimination-free, after which it can be used by any classification algorithm,
yet they have the disadvantage of giving no guarantee about the degree of discrimination
in the final classifier. The model post-processing techniques do not have this dis-
advantage; in principle the post-processing is continued until a discrimination-free
classifier (on a validation set) is obtained. The model post-processing techniques, as
well as the learner adaptation techniques, in turn have the disadvan-
tage of being model- and even algorithm-specific; for every classifier new algorithms
will have to be invented. The experiments further showed that the learner
adaptation approach did not work as expected unless it was combined with the
post-processing techniques. This surprising failure calls for more research to better
understand its causes.
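To make the post-processing idea more concrete, the following is a minimal sketch of a greedy relabeling loop, assuming the effect of flipping each leaf's label on validation-set discrimination and accuracy has been precomputed. The data structure, field names, and the ratio-based selection rule are illustrative assumptions rather than the chapter's exact procedure; the only property carried over from the text is that relabeling continues until the validation-set discrimination reaches the target.

```python
from dataclasses import dataclass

@dataclass
class LeafFlip:
    """Precomputed effect of flipping one leaf's predicted label,
    measured on a validation set (illustrative structure)."""
    leaf_id: int
    disc_drop: float  # reduction in discrimination if this leaf is relabeled
    acc_drop: float   # loss in accuracy if this leaf is relabeled

def relabel_until_fair(flips, current_disc, target_disc=0.0):
    """Greedily relabel leaves with the best discrimination reduction per
    unit of accuracy lost, until validation-set discrimination reaches
    the target. Returns the ids of the relabeled leaves."""
    chosen = []
    # Prefer flips that buy the most fairness for the least accuracy.
    for flip in sorted(flips,
                       key=lambda f: f.disc_drop / max(f.acc_drop, 1e-9),
                       reverse=True):
        if current_disc <= target_disc:
            break
        chosen.append(flip.leaf_id)
        current_disc -= flip.disc_drop
    return chosen
```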
Despite showing some promising results on discrimination-free classifier con-
struction, our study is far from complete. For instance, often there is a much more
complex ecology of attributes than what is assumed in the chapter. In the chapter