seemingly equal individuals are treated differently. Here, however, we are dealing
with a more nuanced situation - individuals are found to be different based on
various statistical analyses. If we assume for the sake of this analysis (and this is
no trivial assumption) that the data used is correct and the statistical models valid,
we are not addressing situations in which equals are treated differently (as the
analysis itself indicates that the individuals are, in fact, different and not equal).
Rather, we are referring to situations in which individuals are distinguished from each other on the basis of factors which might be mathematically correct, yet which society renders normatively irrelevant. Or, we are concerned with the various negative social outcomes which tend to follow from discriminatory practices, such as stigma, stereotyping and the social exclusion of the group subjected to discrimination.
To unpack these difficult questions and bring them into the context of data
mining and automated prediction, we must note two very different forms of
discrimination-based problems which might arise. First, a discussion of
discrimination in this context quickly leads to considering the relevance of racial
discrimination and other repugnant practices of the past. In other words, novel
predictive models can prove to be no more than sophisticated tools to mask the
"classic" forms of discrimination of the past, by hiding it behind scientific findings
and faceless processes. It is possible that data mining will inadvertently use
proxies for factors which society finds socially unacceptable, such as race, gender,
nationality or religion. This might result from various reasons: a faulty learning
dataset, problematic motivations (or subconscious biases) plaguing those
operating the system, improper assumptions regarding the data and the population,
and other reasons society is only beginning to explore.
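The proxy concern can be illustrated with a small, hypothetical simulation (ours, not drawn from this chapter): a facially neutral attribute, here an invented "zip_group" label, happens to correlate with membership in a protected group, and a scoring rule that never sees the protected attribute nonetheless produces markedly different acceptance rates across the two groups.

```python
import random

random.seed(0)

# Hypothetical synthetic population: the protected attribute is never
# shown to the decision rule, but "zip_group" is correlated with it.
population = []
for _ in range(10_000):
    protected = random.random() < 0.5
    # The proxy: protected-group members are more likely to fall in
    # zip_group "B" (purely illustrative numbers).
    zip_group = "B" if random.random() < (0.8 if protected else 0.2) else "A"
    population.append({"protected": protected, "zip_group": zip_group})

def decision(person):
    """A facially neutral rule that relies only on the proxy attribute."""
    return person["zip_group"] == "A"

def acceptance_rate(group):
    return sum(decision(p) for p in group) / len(group)

protected_members = [p for p in population if p["protected"]]
others = [p for p in population if not p["protected"]]

print(f"acceptance rate, protected group: {acceptance_rate(protected_members):.2f}")
print(f"acceptance rate, others:          {acceptance_rate(others):.2f}")
```

Even though the protected attribute never enters the rule, the outcome gap tracks the strength of the correlation between the proxy and group membership, which is precisely the masking effect described above.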
In the next few years, the law must map out the ways in which discrimination is to be defined in this context, and how it should be measured and distinguished from acceptable data practices. Some of the technological measures for doing so are already discussed in Chapter 12 of this book, and we hope that future analyses of these matters will follow. In addition, many policy questions remain unanswered:
at what point should these discriminatory outcomes be measured - after the fact,
or before running the analyses (by first testing them on a sample dataset)? Who would be responsible for establishing whether a process is discriminatory, and what data sources would be required for doing so? For instance, it is possible that to establish whether a seemingly innocuous analysis process is in fact a proxy for unacceptable discriminatory practices, a vast amount of sensitive information is required: information indicating whether individuals within the dataset are members of protected groups. Clearly, obtaining and using such sensitive datasets
leads to complicated privacy problems which require additional thought.
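To see why such sensitive labels are unavoidable for auditing, consider a minimal sketch of a disparate-impact check (a hypothetical illustration; the field names and the 0.8 "four-fifths" threshold are our assumptions, not the chapter's): the ratio simply cannot be computed without knowing which individuals belong to the protected group.

```python
from typing import Iterable, Tuple

def disparate_impact_ratio(observations: Iterable[Tuple[bool, bool]]) -> float:
    """Ratio of favorable-outcome rates, protected group vs. everyone else.

    Each observation is (is_protected, received_favorable_outcome).
    The is_protected label is exactly the sensitive information an
    auditor would need to obtain.
    """
    data = list(observations)
    protected = [fav for prot, fav in data if prot]
    others = [fav for prot, fav in data if not prot]
    if not protected or not others:
        raise ValueError("need observations from both groups")
    return (sum(protected) / len(protected)) / (sum(others) / len(others))

# Toy audit sample: (is_protected, favorable_outcome) pairs.
sample = [(True, False), (True, True), (True, False), (True, False),
          (False, True), (False, True), (False, False), (False, True)]

ratio = disparate_impact_ratio(sample)
print(f"disparate impact ratio: {ratio:.2f}")
# The 0.8 threshold mirrors the common "four-fifths" rule of thumb.
print("flag for review" if ratio < 0.8 else "within the illustrative threshold")
```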
Yet data mining might lead to a second, distinct set of discrimination-based concerns. These concerns result from the fact that data mining processes might systematically single out specific individuals and groups. In such cases, the process could lead to a flurry of novel ethical and legal concerns which society has yet to consider - concerns that these groups are treated unfairly, or will be subjected to the detriments of stereotyping and exclusion which plagued the weaker segments of society in the past and now might be transferred to new forms of