part of the figure shows the partitioning induced by the decision tree. For example,
the third leaf in the tree corresponds to all non-native people without a university
diploma. The leaves can hence be seen as non-overlapping “profiles” dividing up the
space of all instances. Every example fits exactly one profile, and with every profile
exactly one class is associated. When a new example needs to be classified by a
decision tree, it is given the majority class label of the region/profile it falls into. If
some of the profiles are very homogeneous with respect to the sensitive attribute,
for instance containing only members of the deprived community, this may lead to
discriminatory predictions. In l3, for instance, two thirds of the instances are from
the deprived community. The relabeling technique consists of changing the labels of
those regions where doing so yields the highest reduction in discrimination while
sacrificing as little accuracy as possible. Conceptually, this method corresponds to
merging neighboring regions to form larger, less discriminatory profiles. The process
of relabeling continues until the discrimination is removed.
Example 3. Consider the example decision tree given in Figure 12.2. The discrimi-
nation of the decision tree is 20%. Suppose we want to reduce the discrimination to
5%. For each of the leaves, the table below gives how much the discrimination
changes (Δdisc) when relabeling the node, and how much the accuracy decreases
(Δacc). The node for which the tradeoff between discrimination reduction and
lowered accuracy is most beneficial is selected first for relabeling.
Node    Δacc    Δdisc    Δdisc/Δacc
l1      40%     0%       0
l2      10%     10%      1
l3      30%     10%      1/3
In this particular case, the reduction algorithm hence picks l2 to relabel; that is, the
split on degree is removed and leaves l 2 and l 3 are merged.
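The greedy selection step above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function names and the (name, Δacc, Δdisc) tuples are assumptions, and for simplicity it treats the per-leaf Δ values as fixed, whereas in practice merging leaves changes the figures for the remaining nodes.

```python
def pick_leaf_to_relabel(leaves):
    """Return the leaf whose relabeling gives the most discrimination
    reduction per unit of accuracy lost (highest Δdisc/Δacc)."""
    def ratio(leaf):
        name, d_acc, d_disc = leaf
        if d_acc > 0:
            return d_disc / d_acc
        # Free discrimination reduction is ideal; no effect at all scores 0.
        return float("inf") if d_disc > 0 else 0.0
    return max(leaves, key=ratio)

def relabel_until(leaves, disc, target):
    """Greedily relabel leaves until discrimination drops to the target."""
    chosen = []
    remaining = list(leaves)
    while disc > target and remaining:
        leaf = pick_leaf_to_relabel(remaining)
        remaining.remove(leaf)
        chosen.append(leaf[0])
        disc -= leaf[2]  # relabeling this leaf reduces discrimination by Δdisc
    return chosen, disc

# Figures from Example 3: (name, Δacc, Δdisc); discrimination 20%, target 5%.
leaves = [("l1", 0.40, 0.00), ("l2", 0.10, 0.10), ("l3", 0.30, 0.10)]
chosen, disc = relabel_until(leaves, disc=0.20, target=0.05)
print(chosen, disc)  # l2 is picked first, since its ratio 1 beats 1/3 and 0
```

Running the sketch on the numbers from the example selects l2 first, as in the text, and then l3, since a single relabeling only brings the discrimination down to 10%, still above the 5% target.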
12.3.3.2 Related Approaches
The idea of model correction has been explored in different settings, particularly
in cost-sensitive learning, learning from imbalanced data, and context-sensitive
or context-aware learning. Concrete examples of model correction include Naive
Bayes prior correction (see also Chapter 14 of this book), posterior probability
correction based on a confusion matrix (Morris & Misra, 2002), and nearest-
neighbor-based classification or identification correction based on the current
context, e.g., in driver-route identification (Mazhelis, Zliobaite, & Pechenizkiy,
2011) or in context-sensitive correction of phone recognition output (Levit,
Alshawi, Gorin, & Noth, 2003). Tree node relabeling ideas have also been used in
recognizing textual entailment (Heilman & Smith, 2010) and in probabilistic
context-free grammar parsing (Johnson, 1998), but these are unrelated to decision
tree learning. We are not aware of other approaches directly related to the
discussed idea of leaf relabeling in decision trees that are applicable to our setting.