Metrics for measuring success at removing discriminatory rules are given in
Section 13.5. Data quality metrics are listed in Section 13.6. Section 13.7
contains experimental results for the direct discrimination prevention methods
proposed. Conclusions and suggestions for future work are summarized in
Section 13.8.
13.2 Preliminaries
In this section we briefly recall some basic concepts that are needed to
understand the study presented in this chapter.
13.2.1 Basic Notions
A dataset is a collection of data objects (records) and their attributes. Let DB be
the original dataset.
An item is an attribute along with its value, e.g. {Race=black}.
An itemset X is a collection of one or more items, e.g. {Foreign worker=Yes,
City=NYC}.
A classification rule is an expression X → C, where C is a class item (a yes/no
decision) and X is an itemset containing no class item, e.g. {Foreign
worker=Yes, City=NYC} → {hire=no}. X is called the premise of the rule.
The support of an itemset, supp(X), is the fraction of records that contain the
itemset X. We say that a rule X → C is completely supported by a record if both
X and C appear in the record.
The confidence of a classification rule, conf(X → C), measures how often the
class item C appears in records that contain X. Hence, if supp(X) > 0, then
conf(X → C) = supp(X, C) / supp(X). Support and confidence range over [0,1].
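The definitions of support and confidence above can be illustrated with a small sketch. The toy dataset, the dictionary representation of records and itemsets, and the helper names contains, supp and conf are assumptions made for this example only; they are not taken from the chapter.

```python
from fractions import Fraction

# Hypothetical toy dataset: each record maps an attribute to its value.
DB = [
    {"Foreign worker": "Yes", "City": "NYC", "hire": "no"},
    {"Foreign worker": "Yes", "City": "NYC", "hire": "yes"},
    {"Foreign worker": "Yes", "City": "LA", "hire": "yes"},
    {"Foreign worker": "No", "City": "NYC", "hire": "yes"},
]


def contains(record, itemset):
    """True if every item (attribute=value) of the itemset appears in the record."""
    return all(record.get(attr) == value for attr, value in itemset.items())


def supp(db, itemset):
    """Fraction of records in db that contain the itemset."""
    return Fraction(sum(contains(r, itemset) for r in db), len(db))


def conf(db, premise, class_item):
    """conf(X -> C) = supp(X, C) / supp(X), defined only when supp(X) > 0."""
    s = supp(db, premise)
    if s == 0:
        raise ValueError("confidence is undefined when supp(X) = 0")
    return supp(db, {**premise, **class_item}) / s


X = {"Foreign worker": "Yes", "City": "NYC"}  # premise of the rule
C = {"hire": "no"}                            # class item

print(supp(DB, X))     # 1/2: two of the four records contain X
print(conf(DB, X, C))  # 1/2: one of those two records also contains C
```

Exact fractions are used instead of floats so that the values can be compared directly against the ratios in the definitions.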
A frequent classification rule is a classification rule with a support or confi-
dence greater than a specified lower bound. Let FR be the database of frequent
classification rules extracted from DB .
Discriminatory attributes and itemsets (protected by law): Attributes are
classified as discriminatory according to the applicable anti-discrimination
acts (laws). For instance, U.S. federal laws prohibit discrimination on the
basis of the following attributes: race, color, religion, nationality, sex,
marital status, age and pregnancy (Pedreschi et al. 2008). Hence these
attributes are regarded as discriminatory and the itemsets corresponding to
them are called discriminatory itemsets. {Gender=Female, Race=Black} is just an
example of a discriminatory itemset. Let DA_s be the set of predetermined
discriminatory attributes in DB and DI_s be the set of predetermined
discriminatory itemsets in DB.
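As a rough illustration of how an itemset might be checked against a predetermined set of discriminatory attributes, consider the sketch below. It assumes one possible reading of the definition, namely that an itemset counts as discriminatory when every attribute in it belongs to the discriminatory attribute set; the chapter treats the discriminatory itemsets as predetermined, so this rule, the contents of DA_s here, and the function name is_discriminatory_itemset are all hypothetical.

```python
# Hypothetical set of discriminatory attributes, standing in for DA_s.
DA_s = {"Race", "Gender"}


def is_discriminatory_itemset(itemset, discriminatory_attrs):
    """Assumed rule: a non-empty itemset is discriminatory if all of its
    attributes are discriminatory attributes."""
    return bool(itemset) and set(itemset) <= discriminatory_attrs


print(is_discriminatory_itemset({"Gender": "Female", "Race": "Black"}, DA_s))
print(is_discriminatory_itemset({"Foreign worker": "Yes", "City": "NYC"}, DA_s))
```

Under this assumption, an itemset that mixes discriminatory and other attributes, such as {Race=Black, City=NYC}, is not classified as discriminatory; a different convention would need a different test.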
Non-discriminatory attributes and itemsets: If A_s is the set of all the
attributes in DB and I_s the set of all the itemsets in DB, then nDA_s (i.e. set of