Data mining and (group) profiling are techniques that have long been in use, but with the emergence of new technical possibilities and processing capacities they have become the dominant modes of data analysis. Through these techniques, profiles of terrorists are created so as to forestall criminal activities, relationships between specific characteristics and diseases may be discovered so as to prevent diseases or treat them at an early stage, and business profiles are fine-tuned to meet consumer interests. However, there are dangers attached to the use of data mining and profiling. The two major issues concern privacy and discrimination.
Privacy may be endangered when personal data of an individual are gathered, used to profile him or her, or used in practical decisions and practices. Discrimination against a particular person or group may occur when personal characteristics, relating to such information as gender, sexual preference, political and religious beliefs or ethnicity, are gathered, analyzed and used to subject a person or group to a different, disadvantageous treatment. A much used solution to the privacy aspects, which may also be of use in relation to discriminatory practices, is the implementation of so-called privacy enhancing technologies. The technical framework for data processing may be built in such a way that it prevents privacy and discrimination problems, for example through data minimization, which entails that a minimum set of sensitive¹ data is gathered, stored and used.
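To make the data minimization idea concrete, the following is a minimal Python sketch under assumed conditions: the record fields and the list of sensitive attributes are hypothetical examples chosen only for illustration, not a prescribed standard.

# Sketch of data minimization: sensitive attributes are stripped from
# each record before it is stored or analyzed. The field names below
# are hypothetical examples.

SENSITIVE_FIELDS = {"gender", "ethnicity", "religion", "political_view", "sexual_preference"}

def minimize(record: dict) -> dict:
    """Return a copy of the record without sensitive attributes."""
    return {key: value for key, value in record.items() if key not in SENSITIVE_FIELDS}

raw_record = {
    "customer_id": 4711,
    "purchase": "book",
    "gender": "female",        # excluded before storage under data minimization
}

stored_record = minimize(raw_record)
print(stored_record)           # {'customer_id': 4711, 'purchase': 'book'}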
Although data minimization sometimes helps to reduce the scale of danger or damage, it has several disadvantages as well. First and most prominently, when valuable data are excluded from the database, the database decreases in value and usefulness. Secondly, by deleting these data, the context in which the information was gathered and had a certain meaning is lost. This chapter will argue that from this loss of context, a tendency which is inherent to data mining as such but is aggravated by the use of data minimization principles, problems related to privacy and discrimination arise. Thus another, opposite approach is suggested, namely that of data minimummization. This principle requires that a minimum set of data be gathered, stored and clustered. Instead of requiring that certain data not be collected, as the data minimization principle does, the data minimummization principle requires that the context of the data, in the form of metadata, be collected along with the data. By requiring and clustering a minimum set of (contextual) information, the value of the dataset is retained or even increased, and the privacy and discrimination problems following from the loss of context may be better addressed than by the data minimization principle.
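As a contrast to the previous sketch, the following hedged example illustrates the data minimummization idea: instead of dropping fields, each record is stored together with a minimum set of contextual metadata. The metadata fields shown (source, purpose, collection time) are illustrative assumptions, not a schema put forward by this chapter.

# Sketch of data minimummization: every record is clustered with
# contextual metadata describing where, why and when it was gathered,
# so that its original meaning is not lost during later analysis.
# The metadata schema here is a hypothetical example.

from datetime import datetime, timezone

def with_context(record: dict, source: str, purpose: str) -> dict:
    """Attach a minimum set of contextual metadata to a record."""
    return {
        "data": record,
        "metadata": {
            "source": source,          # where the data came from
            "purpose": purpose,        # why it was gathered
            "collected_at": datetime.now(timezone.utc).isoformat(),
        },
    }

record = {"customer_id": 4711, "purchase": "book", "gender": "female"}
stored = with_context(record, source="web shop survey", purpose="product recommendation")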
This chapter will proceed as follows. The first section will briefly distinguish four phases of knowledge discovery in databases. The second and third sections will point out some general rules relating to privacy and discrimination with which these practices may come into conflict. The fourth section will put forward one of the most prominent solutions for these problems, namely that of privacy enhancing technologies and especially the concept of data minimization. The fifth section will analyze some of the problems relating to this technique. The sixth section will offer an alternative solution: data minimummization.
¹ In this chapter, the term 'sensitive data' will refer to both privacy-sensitive and discrimination-sensitive data, unless indicated otherwise.