items, denoting groups of people that could potentially be discriminated against. Given a
classification rule SEX = FEMALE, CAR = OWN → CREDIT = NO, it is straightforward
to separate in its premise SEX = FEMALE from CAR = OWN, in order to reason about
potential discrimination against women with respect to people owning a car.
However, discrimination typically occurs for subgroups rather than for the whole
group (the US courts coined the term “gender-plus allegations” to describe conduct
that breaches the law on the grounds of sex-plus-something-else), or it may occur
for multiple causes (called multiple discrimination in ENAR, 2007). For instance,
we could be interested in discrimination against older women. With our syntax, this
group would be represented as the itemset SEX = FEMALE, AGE = OLDER. The
intersection of two disadvantaged minorities (here, SEX = FEMALE and AGE = OLDER) is
a possibly empty, smaller (and even more disadvantaged) minority as well. As a
consequence, we generalize the notion of potentially discriminatory item to that
of potentially discriminatory (PD) itemset, and assume that the downward closure
property holds for PD itemsets (Ruggieri et al., 2010a).
Definition 1. If A_1 and A_2 are PD itemsets, then A_1, A_2 is a PD itemset as well.
On the technical side, the downward closure property is a sufficient condition for
separating PD itemsets in the premise of a classification rule; namely, there is only
one way A, B of splitting the premise of a rule into a PD itemset A and a PND
itemset B.
Definition 2. A classification rule A, B → C is called potentially discriminatory (PD
rule) if A is non-empty, and potentially non-discriminatory (PND rule) otherwise.
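The split of a premise into A and B, and the PD/PND distinction of Definition 2, can be illustrated with a minimal Python sketch. It rests on an assumption not stated in this passage: that a PD itemset is any set made up only of items drawn from a fixed collection of PD items (here the hypothetical constant PD_ITEMS). Under that assumption the union of two PD itemsets is again a PD itemset, as Definition 1 requires, and the split is unique. The names split_premise and is_pd_rule are illustrative, not taken from the source.

```python
# Hypothetical collection of PD items; which items count as PD is an
# assumption of this sketch, chosen to match the examples in the text.
PD_ITEMS = frozenset({"SEX=FEMALE", "AGE=OLDER"})


def split_premise(premise, pd_items=PD_ITEMS):
    """Split a rule premise into (A, B): A holds the PD items, B the rest.

    Because A is the set of all PD items in the premise, this is the only
    split in which A is PD and B contains no PD item.
    """
    premise = frozenset(premise)
    a = premise & pd_items   # PD part of the premise
    b = premise - pd_items   # PND part of the premise
    return a, b


def is_pd_rule(premise, pd_items=PD_ITEMS):
    """A rule A, B -> C is a PD rule when its PD part A is non-empty."""
    a, _ = split_premise(premise, pd_items)
    return len(a) > 0


# The rule SEX = FEMALE, CAR = OWN -> CREDIT = NO from the text.
a, b = split_premise({"SEX=FEMALE", "CAR=OWN"})
print(sorted(a), sorted(b), is_pd_rule({"SEX=FEMALE", "CAR=OWN"}))
```

With this representation, a rule whose premise is only CAR = OWN has an empty A and is therefore a PND rule.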
PD rules explicitly state conclusions involving potentially discriminated groups. PD
rules cannot be extracted from datasets that do not contain potentially discriminatory
items. In such a case, PND rules can still indirectly unveil discriminatory practices
(see Section 5.4).
Let us consider now how to quantitatively measure the “burden” imposed on such
groups and unveiled by a discovered PD rule. Unfortunately, there is neither uniformity
nor general agreement in legislation on a standard quantification of discrimination.
A general principle mentioned by Knopff (1986) is to consider group
under-representation as a quantitative measure of the qualitative requirement that people
in a group are treated “less favorably” (see European Union Legislation, 2011; U.K.
Legislation, 2011) than others, or such that “a higher proportion of people without
the attribute comply or are able to comply” (see Australian Legislation, 2011) with a
qualifying criterion. We recall from Ruggieri et al. (2010a) the notion of extended
lift (see footnote 1), a measure of the increased confidence in concluding an assertion C
resulting from adding (potentially discriminatory) information A to a rule B → C where no
PD itemset appears.
Footnote 1. The term “extended lift” originates from the fact that it conservatively extends the
well-known measure of lift (or interest factor) of an association rule (Tan et al., 2004), which
is obtained, as a special case, when B is empty. Conversely, the extended lift of A, B → C
corresponds to the lift of A → C over the set of transactions supporting B.
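The relation between extended lift and lift can be checked numerically. The sketch below assumes the usual formulation of extended lift as the ratio conf(A, B → C) / conf(B → C), which is consistent with the description above but is not spelled out in this passage. The toy transactions and all function names are hypothetical and serve only as illustration.

```python
import math


def support_count(transactions, itemset):
    """Number of transactions containing every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def confidence(transactions, premise, conclusion):
    """conf(premise -> conclusion); an empty premise is supported by all transactions."""
    premise, conclusion = frozenset(premise), frozenset(conclusion)
    n_premise = support_count(transactions, premise)
    if n_premise == 0:
        raise ValueError("premise is not supported by any transaction")
    return support_count(transactions, premise | conclusion) / n_premise


def extended_lift(transactions, a, b, c):
    """Assumed formulation: conf(A, B -> C) / conf(B -> C)."""
    base = confidence(transactions, b, c)
    if base == 0:
        raise ValueError("conf(B -> C) is zero; extended lift is undefined")
    return confidence(transactions, frozenset(a) | frozenset(b), c) / base


def lift(transactions, a, c):
    """Standard lift of A -> C, i.e. the extended lift with B empty."""
    return extended_lift(transactions, a, (), c)


# Hypothetical toy data: each transaction is a set of items.
T = [frozenset(t) for t in (
    {"SEX=FEMALE", "CAR=OWN", "CREDIT=NO"},
    {"SEX=FEMALE", "CAR=OWN", "CREDIT=NO"},
    {"SEX=MALE", "CAR=OWN", "CREDIT=NO"},
    {"SEX=MALE", "CAR=OWN", "CREDIT=YES"},
    {"SEX=MALE", "CAR=OWN", "CREDIT=YES"},
    {"SEX=FEMALE", "CREDIT=YES"},
)]
A, B, C = {"SEX=FEMALE"}, {"CAR=OWN"}, {"CREDIT=NO"}

elift = extended_lift(T, A, B, C)
# Lift of A -> C restricted to the transactions that support B.
T_b = [t for t in T if frozenset(B) <= t]
print(elift, lift(T_b, A, C))
```

On this toy data, conf(A, B → C) is 2/2 and conf(B → C) is 3/5, so both printed values are approximately 1.67: adding SEX = FEMALE to the premise raises the confidence of CREDIT = NO by about two thirds relative to car owners in general.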