Congress 1994), the UK Sex Discrimination Act (Parliament of the United Kingdom 1975) and the UK Race Relations Act (Parliament of the United Kingdom 1976). There are several decision-making tasks which lend themselves to discrimination, e.g. loan granting, education, health insurance and staff selection. In
many scenarios, decision-making tasks are supported by information systems.
Given a set of information items on a potential customer, an automated system decides whether the customer should be recommended for credit or for a certain type of life insurance. Automating such decisions reduces the workload of the staff of
banks and insurance companies, among other organizations. The use of information systems based on data mining technology for decision making has attracted the attention of many researchers in the field of computer science. Consequently, automated data collection and a plethora of data mining techniques, such as association and classification rule mining, have been designed and are now widely used for making automated decisions.
At first sight, automating decisions may give a sense of fairness: classification rules (decision rules) are not guided by personal preferences. However, on closer inspection, one realizes that classification rules are actually learned by the
system based on training data. If the training data are inherently biased for or
against a particular community (for example, foreigners), the learned model may exhibit discriminatory behavior. For example, in a certain loan-granting organization, foreign people might systematically have been denied loans throughout the years. If this biased historical dataset is used as training data to learn classification rules for an automated loan-granting system, the learned rules will also be biased against foreign people. In other words, the
system may infer that just being foreign is a legitimate reason for loan denial. A
more detailed analysis of this fact is provided in Chapter 3.
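To see how such bias propagates, here is a minimal sketch (Python with NumPy and scikit-learn, both assumed available): it builds a synthetic, deliberately biased loan dataset and trains a small decision tree on it. The attribute names and the bias pattern are hypothetical illustrations, not data from any real system.

    # Minimal sketch: a decision tree trained on deliberately biased data.
    # Attribute names and the bias pattern are hypothetical illustrations.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(0)
    n = 1000
    foreign = rng.integers(0, 2, n)            # 1 = foreign applicant
    income = rng.normal(50, 15, n)             # income in thousands
    # Biased historical labels: foreign applicants are always denied.
    granted = ((income > 45) & (foreign == 0)).astype(int)

    X = np.column_stack([foreign, income])
    clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, granted)
    print(export_text(clf, feature_names=["foreign", "income"]))

Since the synthetic labels deny every foreign applicant regardless of income, the tree's first split is on the foreign attribute: the learned rules encode "foreign implies deny", which is precisely the inference warned about above.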
Figure 13.1 illustrates the process of discriminatory and non-discriminatory decision rule extraction. If the original biased dataset DB is used for data analysis without any anti-discrimination process (i.e. discrimination discovery and prevention), the discriminatory rules extracted could lead to automated unfair decisions. Alternatively, DB can go through an anti-discrimination process so that the learned rules are free of discrimination, given a list of discriminatory attributes (e.g. gender, race or age). As a result, fair and legitimate automated decisions are enabled.
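As a first illustration of the discovery step in this process, consider screening extracted decision rules against such a list of discriminatory attributes. The sketch below uses a hypothetical rule representation and attribute names; real discovery methods (discussed next) additionally quantify the degree of discrimination rather than merely matching attributes.

    # Minimal sketch: screen extracted rules against a list of
    # discriminatory attributes. The rule representation is hypothetical:
    # a (premise, outcome) pair, with the premise a set of
    # (attribute, value) items.
    DISCRIMINATORY_ATTRS = {"gender", "race", "age"}

    def split_rules(rules, discriminatory=DISCRIMINATORY_ATTRS):
        """Separate rules whose premise mentions a discriminatory attribute."""
        suspect, clean = [], []
        for premise, outcome in rules:
            attrs = {attr for attr, _ in premise}
            target = suspect if attrs & discriminatory else clean
            target.append((premise, outcome))
        return suspect, clean

    rules = [
        ({("gender", "female"), ("savings", "low")}, "deny"),
        ({("savings", "high")}, "grant"),
    ]
    suspect, clean = split_rules(rules)
    print(len(suspect), len(clean))  # 1 1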
Despite the wide deployment of information systems based on data mining
technology in decision making, the issue of anti-discrimination in data mining did
not receive much attention until 2008 (Pedreschi et al. 2008). Since then, some proposals have addressed the discovery and measurement of discrimination, while others deal with its prevention. The discovery of discriminatory decisions was first proposed by Pedreschi et al. (2008) and Ruggieri et al. (2010). The
approach is based on mining classification rules (the inductive part) and reasoning
on them (the deductive part) on the basis of quantitative measures of discrimination that formalize legal definitions of discrimination. For instance, the U.S. Equal Employment Opportunity Commission's four-fifths rule regards a selection rate for any race, sex or ethnic group that is less than four-fifths of the rate for the group with the highest rate as evidence of adverse impact.
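One of the quantitative measures introduced by Pedreschi et al. (2008) is the extended lift, elift(A, B -> C) = conf(A, B -> C) / conf(B -> C), which compares the confidence of a rule whose premise includes a potentially discriminatory itemset A with the confidence of the same rule without A. A minimal sketch, with hypothetical support counts:

    # Minimal sketch of elift for a rule A,B -> C, computed from
    # hypothetical support counts: n_b records satisfy B, n_bc satisfy
    # B and C, n_ab satisfy A and B, n_abc satisfy A, B and C.
    def elift(n_ab, n_abc, n_b, n_bc):
        """elift(A,B -> C) = conf(A,B -> C) / conf(B -> C)."""
        return (n_abc / n_ab) / (n_bc / n_b)

    # E.g. 20% of applicants in some district (B) were denied (C), but
    # 60% of foreign applicants (A) in that district were denied:
    print(elift(n_ab=50, n_abc=30, n_b=200, n_bc=40))  # 3.0

An elift well above 1 indicates that adding the discriminatory itemset to the premise substantially raises the rate of the negative decision, which is the kind of evidence the deductive part of the approach reasons about.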