non-discriminatory attributes) is A_s − DA_s and nDI_s (i.e. the set of non-discriminatory itemsets) is I_s − DI_s. An example of a non-discriminatory itemset could be {Zip=10451, City=NYC}.
• The negated itemset, i.e. ~X, is an itemset with the same attributes as X, but such that the attributes in ~X take any value except those taken by attributes in X. In this chapter, we use the ~ notation for itemsets with binary or categorical attributes. For a binary attribute, e.g. {Foreign worker=Yes/No}, if X is {Foreign worker=Yes}, then ~X is {Foreign worker=No}. Then, if X is binary, it can be converted to ~X and vice versa. However, for a categorical (non-binary) attribute, e.g. {Race=Black/White/Indian}, if X is {Race=Black}, then ~X is {Race=White} or {Race=Indian}. In this case, ~X can be converted to X without ambiguity, but the conversion of X into ~X is not uniquely defined, which
we denote by ~X ⇒ X. In this chapter, we use only non-ambiguous negations.
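The difference between binary and categorical negation can be made concrete with a small sketch. It assumes itemsets are represented as Python dictionaries mapping attribute names to values and that the full domain of each attribute is known; the `DOMAINS` table and the helper names `negations` and `is_unambiguous` are hypothetical, introduced only for illustration.

```python
from itertools import product

# Hypothetical attribute domains, taken from the examples in the text.
DOMAINS = {
    "Foreign worker": ["Yes", "No"],
    "Race": ["Black", "White", "Indian"],
}


def negations(itemset, domains):
    """List every itemset that ~X could stand for.

    Each attribute of X must take some value other than the one it has in X.
    """
    attrs = list(itemset)
    alternatives = [
        [value for value in domains[attr] if value != itemset[attr]]
        for attr in attrs
    ]
    return [dict(zip(attrs, combo)) for combo in product(*alternatives)]


def is_unambiguous(itemset, domains):
    """True when the conversion of X into ~X is uniquely defined."""
    return len(negations(itemset, domains)) == 1


binary_x = {"Foreign worker": "Yes"}
categorical_x = {"Race": "Black"}

print(negations(binary_x, DOMAINS), is_unambiguous(binary_x, DOMAINS))
print(negations(categorical_x, DOMAINS), is_unambiguous(categorical_x, DOMAINS))
```

Under these assumptions the binary itemset has exactly one negation, whereas {Race=Black} has two candidates, which is why only the binary case converts in both directions.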
13.2.2 Direct and Indirect Discriminatory Rules
As discussed more precisely in Chapter 5, frequent classification rules fall into one of the following two classes. A classification rule (r: X → C) with a negative decision (e.g. denying credit or hiring) is potentially discriminatory (PD) if X ∩ DI_s ≠ Ø; otherwise r is potentially non-discriminatory (PND). For example, if DI_s = {Foreign worker=Yes}, the classification rule {Foreign worker=Yes, City=NYC} → Hire=No is PD, whereas {Zip=10451, City=NYC} → Hire=No and {Experience=Low, City=NYC} → Hire=No are PND.
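The PD/PND test above amounts to a set intersection, which the following minimal sketch illustrates. It assumes a rule premise is a frozenset of (attribute, value) pairs and that DI_s is given as a set of such pairs; the name `classify_rule` and this representation are assumptions made for the example, not notation from the chapter.

```python
# Discriminatory items DI_s, as in the example from the text.
DI_S = frozenset({("Foreign worker", "Yes")})


def classify_rule(premise, di_s):
    """Return "PD" if the premise X shares an item with DI_s, else "PND"."""
    return "PD" if premise & di_s else "PND"


# Premises of the three example rules, all with consequent Hire=No.
rules = {
    "r1": frozenset({("Foreign worker", "Yes"), ("City", "NYC")}),
    "r2": frozenset({("Zip", "10451"), ("City", "NYC")}),
    "r3": frozenset({("Experience", "Low"), ("City", "NYC")}),
}

labels = {name: classify_rule(premise, DI_S) for name, premise in rules.items()}
print(labels)
```

The check looks only at the premise; whether a PD rule is actually discriminatory, or a PND rule is redlining, depends on the measures discussed next.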
The word “potentially” means that a PD rule could lead to discriminatory decisions, hence some measures are needed to quantify the direct discrimination potential. Also, a PND rule could lead to discriminatory decisions in combination with some background knowledge; e.g., if the premise of the PND rule contains the zipcode as an attribute and one knows that zipcode 10451 is mostly inhabited by foreign people. Hence, measures are needed to quantify the indirect discrimination potential as well.
As mentioned before, Pedreschi et al. (2008) and Pedreschi et al. (2009a) translated qualitative discrimination statements in existing laws, regulations and legal cases into quantitative formal counterparts over classification rules, and they introduced a family of measures over PD rules (for example elift) for direct discrimination discovery and over PND rules (for example elb) for indirect discrimination discovery. Then, by thresholding elift it can be assessed whether the PD rule has direct discrimination potential. Based on this measure (elift), a PD rule (r: X → C) is said to be discriminatory if elift(r) ≥ α (see footnote 1) or protective if elift(r) < α. In addition, whether the PND rule has indirect discrimination potential can be assessed by thresholding elb. Based on this measure (elb), a PND rule (r': X → C) is said to be redlining if elb(r') ≥ α or non-redlining (legitimate) if elb(r') < α. For more detailed information and definitions of these measures, see Chapter 5.
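The thresholding step can be sketched as follows. This section does not restate the definition of elift, so the sketch assumes the commonly cited form from Pedreschi et al., elift(A,B → C) = conf(A,B → C) / conf(B → C), where A is the discriminatory part of the premise and B the remaining context; Chapter 5 remains the reference for the exact definition. The elb value is taken as a given number rather than computed, and all confidence and elb figures below are invented for illustration. The function names are hypothetical.

```python
ALPHA = 1.25  # example threshold; the four-fifths rule is cited as giving 1.25


def elift(conf_full, conf_context):
    """Assumed definition: conf(A,B -> C) / conf(B -> C)."""
    if conf_context <= 0:
        raise ValueError("conf(B -> C) must be positive")
    return conf_full / conf_context


def label_pd_rule(elift_value, alpha=ALPHA):
    """A PD rule is discriminatory if elift >= alpha, protective otherwise."""
    return "discriminatory" if elift_value >= alpha else "protective"


def label_pnd_rule(elb_value, alpha=ALPHA):
    """A PND rule is redlining if elb >= alpha, non-redlining otherwise."""
    return "redlining" if elb_value >= alpha else "non-redlining"


# Made-up confidences for {Foreign worker=Yes, City=NYC} -> Hire=No
# and for the context-only rule {City=NYC} -> Hire=No.
strong = elift(conf_full=0.75, conf_context=0.5)
weak = elift(conf_full=0.55, conf_context=0.5)

print(strong, label_pd_rule(strong))
print(weak, label_pd_rule(weak))
print(label_pnd_rule(1.3), label_pnd_rule(1.0))
```

Keeping the threshold as a parameter reflects that α is fixed by the applicable law or regulation rather than by the mining procedure.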
1. Note that α is a fixed threshold stating an acceptable level of discrimination according to laws and regulations. For example, the four-fifths rule of U.S. Federal Legislation sets α=1.25.