Congress 1994), the UK Sex Discrimination Act (Parliament of the United Kingdom 1975) and the UK Race Relations Act (Parliament of the United Kingdom 1976). There are several decision-making tasks which lend themselves to discrimination, e.g. loan granting, education, health insurance and staff selection. In
many scenarios, decision-making tasks are supported by information systems.
Given a set of information items on a potential customer, an automated system decides whether the customer should be recommended for credit or for a certain type of life insurance. Automating such decisions reduces the workload of the staff of
banks and insurance companies, among other organizations. The use of information systems based on data mining technology for decision making has attracted the attention of many researchers in the field of computer science. Consequently, automated data collection and a plethora of data mining techniques, such as association and classification rule mining, have been designed and are now widely used for making automated decisions.
At first sight, automating decisions may give a sense of fairness: classification rules (decision rules) are not guided by personal preferences. However, on closer inspection, one realizes that classification rules are actually learned by the
system based on training data. If the training data are inherently biased for or
against a particular community (for example, foreigners), the learned model may exhibit discriminatory behavior. For example, in a certain loan-granting organization, foreign people might systematically have been denied loans throughout the years. If this biased historical dataset is used as training data to learn classification rules for an automated loan-granting system, the learned rules will also be biased against foreign people. In other words, the
system may infer that just being foreign is a legitimate reason for loan denial. A
more detailed analysis of this fact is provided in Chapter 3.
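To see how such bias propagates, here is a minimal sketch (Python with NumPy and scikit-learn, both assumed available): it builds a synthetic, deliberately biased loan dataset and trains a small decision tree on it. The attribute names and the bias pattern are hypothetical illustrations, not data from any real system.

    # Minimal sketch: a decision tree trained on deliberately biased data.
    # Attribute names and the bias pattern are hypothetical illustrations.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(0)
    n = 1000
    foreign = rng.integers(0, 2, n)            # 1 = foreign applicant
    income = rng.normal(50, 15, n)             # income in thousands
    # Biased historical labels: foreign applicants are always denied.
    granted = ((income > 45) & (foreign == 0)).astype(int)

    X = np.column_stack([foreign, income])
    clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, granted)
    print(export_text(clf, feature_names=["foreign", "income"]))

Since the synthetic labels deny every foreign applicant regardless of income, the tree's first split is on the foreign attribute: the learned rules encode "foreign implies deny", which is precisely the inference warned about above.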
Figure 13.1 illustrates the process of discriminatory and non-discriminatory decision rule extraction. If the original biased dataset DB is used for data analysis without any anti-discrimination process (i.e. discrimination discovery and prevention), the discriminatory rules extracted could lead to automated unfair decisions. Alternatively, DB can go through an anti-discrimination process so that the learned rules are free of discrimination, given a list of discriminatory attributes (e.g. gender, race or age). As a result, fair and legitimate automated decisions are enabled.
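As a first illustration of the discovery step in this process, consider screening extracted decision rules against such a list of discriminatory attributes. The sketch below uses a hypothetical rule representation and attribute names; real discovery methods (discussed next) additionally quantify the degree of discrimination rather than merely matching attributes.

    # Minimal sketch: screen extracted rules against a list of
    # discriminatory attributes. The rule representation is hypothetical:
    # a (premise, outcome) pair, with the premise a set of
    # (attribute, value) items.
    DISCRIMINATORY_ATTRS = {"gender", "race", "age"}

    def split_rules(rules, discriminatory=DISCRIMINATORY_ATTRS):
        """Separate rules whose premise mentions a discriminatory attribute."""
        suspect, clean = [], []
        for premise, outcome in rules:
            attrs = {attr for attr, _ in premise}
            target = suspect if attrs & discriminatory else clean
            target.append((premise, outcome))
        return suspect, clean

    rules = [
        ({("gender", "female"), ("savings", "low")}, "deny"),
        ({("savings", "high")}, "grant"),
    ]
    suspect, clean = split_rules(rules)
    print(len(suspect), len(clean))  # 1 1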
Despite the wide deployment of information systems based on data mining
technology in decision making, the issue of anti-discrimination in data mining did
not receive much attention until 2008 (Pedreschi et al. 2008). Since then, some proposals have addressed the discovery and measurement of discrimination, while others deal with its prevention. The discovery of discriminatory decisions was first proposed by Pedreschi et al. (2008) and Ruggieri et al. (2010). The
approach is based on mining classification rules (the inductive part) and reasoning
on them (the deductive part) on the basis of quantitative measures of discrimination that formalize legal definitions of discrimination. For instance, the U.S. Equal Employment Opportunity Commission's four-fifths rule regards a selection rate for any race, sex or ethnic group that is less than four-fifths of the rate for the group with the highest rate as evidence of adverse impact.
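One of the quantitative measures introduced by Pedreschi et al. (2008) is the extended lift, elift(A, B -> C) = conf(A, B -> C) / conf(B -> C), which compares the confidence of a rule whose premise includes a potentially discriminatory itemset A with the confidence of the same rule without A. A minimal sketch, with hypothetical support counts:

    # Minimal sketch of elift for a rule A,B -> C, computed from
    # hypothetical support counts: n_b records satisfy B, n_bc satisfy
    # B and C, n_ab satisfy A and B, n_abc satisfy A, B and C.
    def elift(n_ab, n_abc, n_b, n_bc):
        """elift(A,B -> C) = conf(A,B -> C) / conf(B -> C)."""
        return (n_abc / n_ab) / (n_bc / n_b)

    # E.g. 20% of applicants in some district (B) were denied (C), but
    # 60% of foreign applicants (A) in that district were denied:
    print(elift(n_ab=50, n_abc=30, n_b=200, n_bc=40))  # 3.0

An elift well above 1 indicates that adding the discriminatory itemset to the premise substantially raises the rate of the negative decision, which is the kind of evidence the deductive part of the approach reasons about.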