respectively. Furthermore, Hajian and Domingo-Ferrer present metrics that can be
used to evaluate the performance of these approaches and show that
discrimination removal can be done at a minimal loss of information.
In Chapter 14, Verwer and Calders show how positive discrimination (also
known as affirmative action) can be introduced in predictive models. Three
solutions based upon so-called Bayesian classifiers are introduced. The first
technique is based on setting different thresholds for different groups. For
instance, if there are income differences between men and women in a database,
men can be given a high income label above $90,000, whereas women can be
given a high income label above $75,000. Instead of income figures, the labels
high and low income could be applied. This instantly reduces the discriminating
pattern. The second technique focuses on learning two separate models, one for
each group. Predictions from these models are independent of the sensitive
attribute. The third and most sophisticated technique focuses on discovering the
labels a dataset would have contained had it been discrimination-free.
These latent (or hidden) variables can be seen as attributes of which no value is
recorded in the dataset. Verwer and Calders show how decisions can be reverse
engineered by explicitly modeling discrimination.
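The first technique, group-specific decision thresholds, can be illustrated with a minimal sketch. This is not the authors' implementation: the function and variable names are hypothetical, and only the $90,000 and $75,000 cut-offs come from the example in the text.

```python
# Sketch of the threshold technique: the same "high"/"low" income label is
# assigned using a different cut-off per group, so that label proportions
# between groups can be equalized. Names and data are illustrative only.

def label_income(income, group):
    """Assign a 'high'/'low' label using a per-group threshold."""
    thresholds = {"men": 90_000, "women": 75_000}  # cut-offs from the text's example
    return "high" if income > thresholds[group] else "low"

records = [
    ("men", 95_000),
    ("men", 80_000),
    ("women", 80_000),
    ("women", 70_000),
]
labels = [label_income(income, group) for group, income in records]
print(labels)  # ['high', 'low', 'high', 'low']
```

Note that an income of $80,000 is labeled "low" for a man but "high" for a woman; this is exactly how the differing thresholds counteract the discriminating pattern in the underlying income figures.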
1.4.5 Part V: Solutions in Law, Norms and the Market
Part V of this book provides non-technological solutions to the discrimination and
privacy issues discussed in Part II. These solutions may be found in legislation,
norms and the market. Many such solutions are discussed in other books and
papers, such as (to name only a few) the regulation of profiling, 37 criteria for
balancing privacy concerns and the common good, 38 self-regulation of privacy, 39
organizational change and a more academic approach, 40 and valuating privacy in a
consumer market. 41 We do not discuss these suggested solutions in this book, but
we do add a few other suggested solutions to this body of work.
In Chapter 15, Van der Sloot proposes to use minimum datasets to avoid
discrimination and privacy violations in data mining and profiling. Discrimination
and privacy are often addressed by implementing data minimization principles,
restricting the collection and processing of data. Although data minimization may
help to limit the impact of security breaches, it also has several disadvantages.
First, the dataset may lose value when reduced to a bare minimum and, second,
the context and meaning of the data may get lost. This loss of context may cause
or aggravate privacy and discrimination issues. Therefore, Van der Sloot suggests
an opposite approach, in which minimum datasets are mandatory. This better
ensures adequate data quality and may prevent loss of context.
In Chapter 16, Finocchiaro and Ricci focus on the opposite of being profiled,
which is building one's own digital reputation. Although people have some
37 See, for instance, Bygrave, L.A. (2002).
38 Etzioni, A. (1999), pp. 12-13.
39 Regan, P.M. (2002).
40 See, for instance, Posner, R.A. (2006), p. 210.
41 See, for instance, Böhme (2009) and Böhme and Koble (2007).