15.2 Data Mining and Profiling Techniques
Data mining is commonly used as an umbrella concept for knowledge discovery in
databases, though, more correctly, it is only one of several phases. 2 The first step of
knowledge discovery in databases is the gathering of data. Gathering information
may be done for example through fieldwork, queries, harvesting the internet and
personal observations, but also through interconnecting databases and merging
them together. The second step is storing the data and organizing the material. The latter
may be necessary not only to make the data computer readable, but also to
enable correct analysis of the data and to make them comparable. The third phase
is that of actual data mining. Data mining refers to the discovery, most commonly
with the use of (mathematical) algorithms, of hidden patterns and subtle relationships
in data and the inference of rules that allow for the prediction of future results. 3
The patterns and relationships need not be causal, but may also be statistical.
Also, these patterns may be indirect, so that the direct relationship between,
for example, race and solvency is replaced by the relationship between a racially
determined zip code and solvency. This is called redlining or masking. 4 The
final stage in the process is applying the knowledge and patterns in real-life decisions.
This is often done with the assistance of either individual or group profiles. 5
A pattern obtained through data mining will commonly show the probability that
characteristic A is combined with characteristic B. For example, it might be discovered
that 67% of the people with curly hair use hair products to style their
hairdo or that 86% of the people having a certain zip code possess an expensive
car. Thus, targeting such groups most commonly entails a certain margin of error.
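The probability that characteristic A is combined with characteristic B, and the margin of error that follows from targeting the whole group, can be illustrated with a minimal sketch. The records, the zip codes and the function name rule_confidence below are hypothetical and invented purely for illustration; they are not taken from any of the cited studies.

```python
# Hypothetical records: (zip_code, owns_expensive_car).
# The data are invented for illustration only.
records = [
    ("1011", True), ("1011", True), ("1011", True), ("1011", False),
    ("2022", False), ("2022", True), ("2022", False), ("2022", False),
]


def rule_confidence(records, a_value):
    """Estimate P(B | A): the share of records with characteristic A
    (here: a given zip code) that also have characteristic B."""
    with_a = [has_b for a, has_b in records if a == a_value]
    if not with_a:
        raise ValueError(f"no records with characteristic {a_value!r}")
    return sum(with_a) / len(with_a)


confidence = rule_confidence(records, "1011")
# Treating every member of the group as having B is wrong for the remainder.
error_margin = 1 - confidence
print(f"confidence: {confidence:.2f}, misclassified share: {error_margin:.2f}")
```

In this sketch the group profile "zip code 1011 implies an expensive car" holds for three out of four records, so applying it to every member of the group would be wrong for the remaining quarter.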
15.3 Data Protection Legislation
Knowledge discovery in databases may come into conflict with, among others, two
legal values: privacy and equality. To provide a basic foundation for
assessing the (il)legality of such practices, this section will address the topic of
privacy and data protection legislation; the next one will do so with regard to
anti-discrimination laws. The main focus will be on European legislation.
Privacy refers to the right to respect for one's private and family life, home and
communications, while data protection refers to the right to the protection of personal
data concerning a person. The right to privacy is most prominently protected
by the European Convention on Human Rights and is a moral concept, seen as
instrumental in relation to the realisation of autonomy, negative freedom and dignity.
If these values are violated or endangered, for example through the use of
data mining, then this practice is prohibited unless it is prescribed by law, is
necessary in a democratic society, and the infringement is proportionate
to the goal it serves.
2 Custers (2004); Skillicorn (2009); Westphal (2009); Larose (2006).
3 <http://www.gao.gov/new.items/d07293.pdf>.
4 Squires (2003); Kuhn (1987); LaCour-Little (1999).
5 Hildebrandt & Gutwirth (2008).