Direct and Indirect Discrimination Prevention Methods - Discrimination and Privacy in the Information Society

non-discriminatory attributes) is A_s − DA_s, and nDI_s (i.e., the set of non-discriminatory itemsets) is I_s − DI_s. An example of a non-discriminatory itemset could be {Zip=10451, City=NYC}.

• The negated itemset, i.e., ~X, is an itemset with the same attributes as X, but such that the attributes in ~X take any value except those taken by the attributes in X. In this chapter, we use the ~ notation for itemsets with binary or categorical attributes. For a binary attribute, e.g., {Foreign worker=Yes/No}, if X is {Foreign worker=Yes}, then ~X is {Foreign worker=No}. Thus, if X is binary, it can be converted to ~X and vice versa. However, for a categorical (non-binary) attribute, e.g., {Race=Black/White/Indian}, if X is {Race=Black}, then ~X is {Race=White} or {Race=Indian}. In this case, ~X can be converted to X without ambiguity, but the conversion of X into ~X is not uniquely defined, which we denote by ~X ⇒ X. In this chapter, we use only non-ambiguous negations.
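The difference between binary and categorical negation can be illustrated with a small sketch. The function names `negate` and `is_unambiguous`, and the `domains` dictionary listing each attribute's possible values, are assumptions introduced here for illustration; they do not come from the chapter.

```python
def negate(itemset, domains):
    """Return, for each attribute in X, the values that ~X may take.

    itemset: dict mapping attribute -> value, e.g. {"Foreign worker": "Yes"}
    domains: dict mapping attribute -> list of all possible values
             (assumed to be known in advance)
    """
    return {attr: [v for v in domains[attr] if v != value]
            for attr, value in itemset.items()}


def is_unambiguous(itemset, domains):
    """True when ~X fixes every attribute to exactly one value."""
    return all(len(values) == 1
               for values in negate(itemset, domains).values())


# Attribute domains taken from the examples in the text.
domains = {
    "Foreign worker": ["Yes", "No"],
    "Race": ["Black", "White", "Indian"],
}

binary_x = {"Foreign worker": "Yes"}
categorical_x = {"Race": "Black"}

print(negate(binary_x, domains))              # {'Foreign worker': ['No']}
print(is_unambiguous(binary_x, domains))      # True
print(negate(categorical_x, domains))         # {'Race': ['White', 'Indian']}
print(is_unambiguous(categorical_x, domains)) # False
```

For the binary attribute the negation is a single itemset, whereas for the categorical attribute it is a set of alternatives, which is why only the former can be converted in both directions.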

13.2.2 Direct and Indirect Discriminatory Rules

As discussed in more detail in Chapter 5, frequent classification rules fall into one of the following two classes. A classification rule (r: X → C) with a negative decision (e.g., denying credit or hiring) is potentially discriminatory (PD) if X ∩ DI_s ≠ ∅; otherwise, r is potentially non-discriminatory (PND). For example, if DI_s = {Foreign worker=Yes}, the classification rule {Foreign worker=Yes, City=NYC} → Hire=No is PD, whereas {Zip=10451, City=NYC} → Hire=No and {Experience=Low, City=NYC} → Hire=No are PND.
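The PD/PND test above amounts to a set intersection. A minimal sketch, assuming that items are represented as (attribute, value) pairs and that the rule's decision is already known to be negative; the function name `classify_rule` is hypothetical:

```python
def classify_rule(premise, discriminatory_items):
    """Label a negative-decision rule X -> C as 'PD' or 'PND'.

    premise: set of (attribute, value) items forming X
    discriminatory_items: set of (attribute, value) items forming DI_s
    """
    # PD exactly when the premise shares at least one item with DI_s.
    return "PD" if premise & discriminatory_items else "PND"


DI_s = {("Foreign worker", "Yes")}

# Premises of the three example rules from the text (all with Hire=No).
rules = [
    {("Foreign worker", "Yes"), ("City", "NYC")},
    {("Zip", "10451"), ("City", "NYC")},
    {("Experience", "Low"), ("City", "NYC")},
]

labels = [classify_rule(premise, DI_s) for premise in rules]
print(labels)  # ['PD', 'PND', 'PND']
```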

The word “potentially” means that a PD rule could lead to discriminatory decisions; hence, some measures are needed to quantify the direct discrimination potential. A PND rule could also lead to discriminatory decisions in combination with some background knowledge, e.g., if the premise of the PND rule contains the zipcode as an attribute and one knows that zipcode 10451 is mostly inhabited by foreign people. Hence, measures are needed to quantify the indirect discrimination potential as well.

As mentioned before, Pedreschi et al. (2008) and Pedreschi et al. (2009a) translated qualitative discrimination statements in existing laws, regulations and legal cases into quantitative formal counterparts over classification rules. They introduced a family of measures over PD rules (for example, elift) for direct discrimination discovery and over PND rules (for example, elb) for indirect discrimination discovery. By thresholding elift, one can assess whether a PD rule has direct discrimination potential: a PD rule (r: X → C) is said to be discriminatory if elift(r) ≥ α (see footnote 1) or protective if elift(r) < α. Similarly, by thresholding elb, one can assess whether a PND rule has indirect discrimination potential: a PND rule (r': X → C) is said to be redlining if elb(r') ≥ α or non-redlining (legitimate) if elb(r') < α. For more detailed information and definitions of these measures, see Chapter 5.
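The thresholding step can be sketched as follows. This section does not restate the measures, so the sketch assumes the extended lift of Pedreschi et al. (2008), elift(A,B → C) = conf(A,B → C) / conf(B → C), and treats elb as a precomputed value. The function names and the confidence values are hypothetical.

```python
def elift(conf_rule, conf_base):
    """Extended lift of A,B -> C, i.e. conf(A,B -> C) / conf(B -> C).

    Assumption: this formula follows Pedreschi et al. (2008); it is not
    defined in this section (see Chapter 5).
    """
    if conf_base <= 0:
        raise ValueError("conf(B -> C) must be positive")
    return conf_rule / conf_base


def label_pd_rule(elift_value, alpha):
    """A PD rule is discriminatory if elift >= alpha, else protective."""
    return "discriminatory" if elift_value >= alpha else "protective"


def label_pnd_rule(elb_value, alpha):
    """A PND rule is redlining if elb >= alpha, else non-redlining."""
    return "redlining" if elb_value >= alpha else "non-redlining"


ALPHA = 1.25  # example threshold quoted in footnote 1

# Hypothetical confidences, made up purely for illustration.
value = elift(conf_rule=0.75, conf_base=0.5)
print(value)                          # 1.5
print(label_pd_rule(value, ALPHA))    # discriminatory
print(label_pd_rule(1.1, ALPHA))      # protective
print(label_pnd_rule(1.3, ALPHA))     # redlining
```

Computing elb additionally requires background-knowledge rules, which is why it is passed in as a number here rather than derived.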

1 Note that α is a fixed threshold stating an acceptable level of discrimination according to laws and regulations. For example, the four-fifths rule of U.S. Federal Legislation sets α=1.25.
