exercise is reduced'. 32 Secondly, knowledge discovery in databases in general and
data minimization in particular undermine the context in which data play a role
and have a certain meaning, which may create or aggravate (the risk of) privacy
violations and discriminatory practices.
Firstly, to retain the value and the meaning of the data, the data itself should be
correct and accurate. This may also entail the inclusion of contextual information.
However, this principle is often undermined in knowledge discovery in databases,
among other reasons because a margin of error is commonly accepted. 33 Knowledge
discovery also involves a simplification and a decontextualization of reality, since
an analysis of a few determining categories is often easier, yields more direct and
concrete correlations and is thus more valuable than a model which tries to
approximate reality's complexity. 34 Last but not least, accurate and complete data
gathering involves costs, which not all parties involved in data mining are willing
to bear because a particular threshold of reliability is often sufficient.
Secondly, the data should be updated so that changed facts or changed contexts
are incorporated in the database. Typically, however, data mining and profiling are
used to predict the behavior of people on the basis of old information. Furthermore,
when storing the data, one or more of four weaknesses commonly occur.
'The data may be incomplete, missing fields or records. It may be incorrect,
involving non-standard codes, incorrect calculations, duplication, linkage to the
wrong individual or other mistaken inputting; the initial information provided may
have been incorrect. It may be incomprehensible, involving (for example) bad
formatting or the inclusion of multiple fields in one field. It may be inconsistent,
involving overlapping codes or code meanings that change over time. Furthermore,
even if data is recorded accurately and properly, different databases may use
different formatting standards, making data sharing or the "interoperability" of
different databases difficult.' 35
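The four weaknesses named in the quotation can be made concrete with a minimal sketch of automated checks. The field names, the list of valid codes and the use of a semicolon as a sign of several values packed into one field are hypothetical and chosen only for illustration; none of them comes from the cited source. Such checks flag candidate problems only. They cannot establish that the information initially provided was incorrect, and a repeated identifier is merely a rough proxy for duplication or linkage to the wrong individual.

```python
from collections import Counter

# Hypothetical schema and code list, invented for illustration only.
REQUIRED_FIELDS = ("person_id", "postcode", "status_code")
KNOWN_STATUS_CODES = {"A", "B", "C"}


def audit_records(records):
    """Return a dict mapping each weakness to the indexes of flagged records."""
    flagged = {"incomplete": [], "incorrect": [], "incomprehensible": []}
    id_counts = Counter(rec.get("person_id") for rec in records)
    for i, rec in enumerate(records):
        # Incomplete: a required field is absent or empty.
        if any(not rec.get(field) for field in REQUIRED_FIELDS):
            flagged["incomplete"].append(i)
        # Incorrect: a non-standard code, or the same identifier recorded twice.
        code = rec.get("status_code")
        pid = rec.get("person_id")
        if (code and code not in KNOWN_STATUS_CODES) or (pid and id_counts[pid] > 1):
            flagged["incorrect"].append(i)
        # Incomprehensible: several values packed into a single field.
        if any(isinstance(value, str) and ";" in value for value in rec.values()):
            flagged["incomprehensible"].append(i)
    return flagged


def inconsistent_codes(codebooks):
    """Return the codes whose meaning differs between codebook versions."""
    meanings = {}
    for book in codebooks:
        for code, meaning in book.items():
            meanings.setdefault(code, set()).add(meaning)
    return sorted(code for code, seen in meanings.items() if len(seen) > 1)
```

Calling audit_records on a list of dictionaries returns the positions of the records that need review, and inconsistent_codes compares successive versions of a codebook to reveal code meanings that changed over time.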
Thirdly, to retain the value and the meaning of the data, the context of data
should be preserved in the process of data analysis and mining. However, harvesting
different databases or merging databases together, as is often done in data mining,
may give rise to a problem. '[W]hen data is used in a new
context, it may not be interpreted in the same way as previously used, because the
new party using the data may not understand how the data was originally
classified.' 36 When data are used for purposes not envisaged at the time they were
gathered, they may be taken and judged out of context. For example, the '[] data which circulate
on the web were “issued” by people concerned with a precise objective, or in a
particular context. The exchanges of data of all kinds and the possibilities to use search
engines with any key words engender the risk that we be judged “out of context”.
[This also refers to] the question of contextual integrity; the person provides his/her
32 Schermer (2011), p. 49.
33 Ramasastry (2006).
34 Larose (2006), pp. 1-2.
35 Renke (2006), pp. 791-792.
36 Ramasastry (2006), p. 778.