non-discriminatory attributes) is A_s - DA_s, and nDI_s (i.e. the set of
non-discriminatory itemsets) is I_s - DI_s. An example of a non-discriminatory
itemset could be {Zip=10451, City=NYC}.
The negated itemset, i.e. ~X, is an itemset with the same attributes as X, but
such that the attributes in ~X take any value except those taken by the
attributes in X. In this chapter, we use the ~ notation for itemsets with binary
or categorical attributes. For a binary attribute, e.g. {Foreign worker=Yes/No},
if X is {Foreign worker=Yes}, then ~X is {Foreign worker=No}. Hence, if X is
binary, it can be converted to ~X and vice versa. However, for a categorical
(non-binary) attribute, e.g. {Race=Black/White/Indian}, if X is {Race=Black},
then ~X is {Race=White} or {Race=Indian}. In this case, ~X can be converted to X
without ambiguity, but the conversion of X into ~X is not uniquely defined,
which we denote by ~X ⇒ X. In this chapter, we use only non-ambiguous negations.
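The difference between binary and categorical negation can be made concrete
with a small sketch. The attribute domains below are taken from the examples in
the text; the function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional

# Attribute domains from the examples in the text (assumed complete here).
DOMAINS: Dict[str, List[str]] = {
    "Foreign worker": ["Yes", "No"],
    "Race": ["Black", "White", "Indian"],
}


def negated_values(attribute: str, value: str) -> List[str]:
    """Return every value of `attribute` other than `value`, i.e. the values ~X may take."""
    return [v for v in DOMAINS[attribute] if v != value]


def unambiguous_negation(attribute: str, value: str) -> Optional[str]:
    """Return the single negated value, or None when the negation is ambiguous."""
    others = negated_values(attribute, value)
    return others[0] if len(others) == 1 else None


# Binary attribute: exactly one candidate, so X <-> ~X is well defined.
binary_negation = unambiguous_negation("Foreign worker", "Yes")

# Categorical attribute: two candidates, so X -> ~X is not uniquely defined.
categorical_candidates = negated_values("Race", "Black")
```

Returning None for the categorical case mirrors the restriction to
non-ambiguous negations: only attributes with a single remaining value yield a
usable ~X.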
13.2.2 Direct and Indirect Discriminatory Rules
As discussed more precisely in Chapter 5, frequent classification rules fall
into one of the following two classes. A classification rule (r: X → C) with a
negative decision (e.g. denying credit or hiring) is potentially discriminatory
(PD) if X ∩ DI_s ≠ Ø; otherwise r is potentially non-discriminatory (PND). For
example, if DI_s = {Foreign worker=Yes}, the classification rule
{Foreign worker=Yes; City=NYC} → Hire=No is PD, whereas
{Zip=10451, City=NYC} → Hire=No and {Experience=Low; City=NYC} → Hire=No are
PND.
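The PD/PND test is a set intersection between the rule premise and DI_s, which
the following minimal sketch illustrates. It represents an item as an
(attribute, value) pair and assumes the rule's decision is already known to be
negative; the names classify_rule and DI_S are illustrative, not part of the
original framework.

```python
from typing import FrozenSet, Set, Tuple

Item = Tuple[str, str]  # (attribute, value)

# Assumed set of discriminatory items DI_s, as in the example above.
DI_S: Set[Item] = {("Foreign worker", "Yes")}


def classify_rule(premise: FrozenSet[Item], di_s: Set[Item] = DI_S) -> str:
    """Label a negative-decision rule X -> C as 'PD' if X shares an item with DI_s, else 'PND'."""
    return "PD" if premise & di_s else "PND"


# Premises of the three example rules; the consequent is Hire=No in each case.
rule_foreign = frozenset({("Foreign worker", "Yes"), ("City", "NYC")})
rule_zip = frozenset({("Zip", "10451"), ("City", "NYC")})
rule_experience = frozenset({("Experience", "Low"), ("City", "NYC")})

labels = [classify_rule(p) for p in (rule_foreign, rule_zip, rule_experience)]
```

Note that the zipcode rule is labelled PND here even though it may act as a
proxy; the check looks only at the items that appear explicitly in the premise.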
The word “potentially” means that a PD rule may lead to discriminatory
decisions, hence some measures are needed to quantify the direct discrimination
potential. Also, a PND rule could lead to discriminatory decisions in
combination with some background knowledge, e.g. if the premise of the PND rule
contains the zipcode as an attribute and one knows that zipcode 10451 is mostly
inhabited by foreign people. Hence, measures are needed to quantify the indirect
discrimination potential as well.
As mentioned before, Pedreschi et al. (2008) and Pedreschi et al. (2009a)
translated qualitative discrimination statements in existing laws, regulations
and legal cases into quantitative formal counterparts over classification rules,
and they introduced a family of measures over PD rules (for example elift) for
direct discrimination discovery and over PND rules (for example elb) for
indirect discrimination discovery. By thresholding elift, it can be assessed
whether a PD rule has direct discrimination potential: a PD rule (r: X → C) is
said to be discriminatory if elift(r) ≥ α¹ or protective if elift(r) < α.
Similarly, whether a PND rule has indirect discrimination potential can be
assessed by thresholding elb: a PND rule (r': X → C) is said to be redlining if
elb(r') ≥ α or non-redlining (legitimate) if elb(r') < α. For more detailed
information and definitions of these measures, see Chapter 5.
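To illustrate the thresholding step for direct discrimination, the sketch below
assumes the extended lift in the form given by Pedreschi et al. (2008),
elift(A, B → C) = conf(A, B → C) / conf(B → C), where A is the discriminatory
part of the premise and B the remaining context; Chapter 5 remains the
authoritative source for the definition. The records are made-up toy data, and
all function names are hypothetical.

```python
from typing import Dict, List

Record = Dict[str, str]


def support_count(records: List[Record], itemset: Dict[str, str]) -> int:
    """Count the records that contain every attribute=value pair in `itemset`."""
    return sum(all(r.get(k) == v for k, v in itemset.items()) for r in records)


def confidence(records: List[Record], premise: Dict[str, str],
               consequent: Dict[str, str]) -> float:
    """conf(premise -> consequent) = supp(premise, consequent) / supp(premise)."""
    covered = support_count(records, premise)
    if covered == 0:
        raise ValueError("premise matches no records")
    return support_count(records, {**premise, **consequent}) / covered


def elift(records: List[Record], a: Dict[str, str], b: Dict[str, str],
          c: Dict[str, str]) -> float:
    """Assumed definition: conf(A, B -> C) / conf(B -> C)."""
    base = confidence(records, b, c)
    if base == 0:
        raise ValueError("conf(B -> C) is zero, so elift is undefined")
    return confidence(records, {**a, **b}, c) / base


def label_pd_rule(value: float, alpha: float) -> str:
    """Apply the threshold: discriminatory if elift >= alpha, else protective."""
    return "discriminatory" if value >= alpha else "protective"


# Toy data (invented for illustration): 10 NYC applicants.
records = (
    [{"Foreign worker": "Yes", "City": "NYC", "Hire": "No"}] * 3
    + [{"Foreign worker": "Yes", "City": "NYC", "Hire": "Yes"}] * 1
    + [{"Foreign worker": "No", "City": "NYC", "Hire": "No"}] * 2
    + [{"Foreign worker": "No", "City": "NYC", "Hire": "Yes"}] * 4
)

value = elift(records, {"Foreign worker": "Yes"}, {"City": "NYC"}, {"Hire": "No"})
verdict = label_pd_rule(value, alpha=1.25)
```

On this toy data conf(A, B → C) is 3/4 and conf(B → C) is 5/10, so elift is 1.5
and the rule counts as discriminatory at α = 1.25. An analogous threshold check
would apply to elb for PND rules, but elb is not sketched here because its
definition is deferred to Chapter 5.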
¹ Note that α is a fixed threshold stating an acceptable level of discrimination
according to laws and regulations. For example, the four-fifths rule of U.S.
Federal Legislation sets α=1.25.