Table 5.1 The German credit case study: attributes (top) and an excerpt of the dataset (bottom)
Attributes
  on personal properties: checking account status, duration, savings status, property magnitude, type of housing
  on credits: credit history, credit request purpose, credit request amount, installment commitment, existing credits, other parties, other payment
  on employment: job type, employment since, number of dependents, own telephone
  on personal status: personal status and gender, age, resident since, foreign worker
Decision
  CLASS, with values GOOD (grant credit) and BAD (deny credit)
Potentially discriminatory (PD) items
  PERSONAL STATUS = FEMALE (female)
  AGE = GT 52 (senior people)
  FOREIGN WORKER = YES (foreign workers)
PERS STATUS   AGE       JOB       PURPOSE    CREDIT AMNT  HOUSING   ...  CLASS
female        gt 52     self emp  new car    lt 38k       rent      ...  bad
male married  30 to 41  unemp     used car   39k to 75k   own       ...  good
male single   42 to 51  skilled   business   75k to 111k  for free  ...  good
female        gt 52     unemp     furniture  lt 38k       own       ...  bad
...           ...       ...       ...        ...          ...       ...  ...
5.2 Classification Rules for Discrimination Discovery
As a running example throughout the chapter, we use the public-domain German
credit dataset, available from the UCI repository of machine learning
datasets (Newman, Hettich, Blake, & Merz, 1998). The dataset consists of 1000
records on bank account holders. It includes 20 nominal (or discretized) attributes,
as shown in Table 5.1. The decision attribute takes values representing the good/bad
creditor classification of the bank account holder.
5.2.1 Classification Rules
Given a relation with n attributes, we refer to an item as an expression a = v, where
a is an attribute and v is one of its possible values. For example, PERSONAL STATUS
= MALE SINGLE is an item for the German credit dataset. One of the attributes is
taken as the class attribute, i.e., the attribute referring to the decision. In our running
example, the class attribute is named CLASS and its two possible items are CLASS = GOOD,
meaning that credit is granted, and CLASS = BAD, meaning that credit is denied.
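The item notation above can be sketched in Python as attribute-value pairs. This is only an illustration under stated assumptions: the names Item, row_to_items and is_class_item are hypothetical and do not come from the chapter, and the row uses just the columns visible in the excerpt of Table 5.1.

```python
from typing import NamedTuple


class Item(NamedTuple):
    """An item, i.e. the expression `attribute = value` (hypothetical type)."""
    attribute: str
    value: str

    def __str__(self) -> str:
        return f"{self.attribute} = {self.value}"


# Name of the decision attribute in the running example.
CLASS_ATTRIBUTE = "CLASS"


def row_to_items(row):
    """Build one item per attribute from a row given as {attribute: value}."""
    return [Item(attribute, value) for attribute, value in row.items()]


def is_class_item(item):
    """True if the item is over the class (decision) attribute."""
    return item.attribute == CLASS_ATTRIBUTE


# First row of the excerpt in Table 5.1, restricted to the columns shown there.
first_row = {
    "PERSONAL STATUS": "FEMALE",
    "AGE": "GT 52",
    "JOB": "SELF EMP",
    "PURPOSE": "NEW CAR",
    "CREDIT AMNT": "LT 38K",
    "HOUSING": "RENT",
    "CLASS": "BAD",
}

items = row_to_items(first_row)
class_items = [item for item in items if is_class_item(item)]

print(", ".join(str(item) for item in items))
print(class_items[0])  # CLASS = BAD
```

Using an immutable tuple keeps items hashable, so they can later be collected into sets without further changes.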
A transaction T is a set of items, one for each attribute of the relation. Intuitively,
a transaction is the set of items corresponding to a row of a table. By an itemset X we
mean a set of items, and we say that a transaction T supports an itemset X if every
item in X belongs to T as well, in symbols X ⊆ T. As an example, the transaction
corresponding to the first row in Table 5.1 supports the itemset PERSONAL STATUS