2008; Slobogin, 2008). These are searches that are not driven by a specific
individual who generates interest or suspicion. Rather, they focus on events,
from which patterns of behavior describing those events are identified. These
patterns are later used to lead back to individuals who pose greater risks, based
on previous
occurrences. Data mining methods require analysts to define specific parameters,
and thereafter the software itself sifts through data and points out trends within
relevant datasets.
While the automated nature of this process generates great public interest,
human discretion still plays an important role. Analysts carry out extensive tasks
at all stages of the analysis process (Zarsky, 2012). The dataset must be actively
constructed, at times by bringing together data from many sources (Ramasastry,
2004). This task requires various decisions, such as which databases should be
used and how specific attributes are to be matched. Other decisions are more
subtle, such as how to define a parameter, and what counts as an “event” which
will trigger further analysis. Next, the analysts play an active role in defining the
parameters of the actual data mining analysis and the creation of clusters, links
and decision trees which are later applied (Zarsky, 2002-3). This is done both in
advance and after the fact, by “weeding” out results the analyst might consider
random, wrong or insignificant. Thus, while the process is seemingly
computerized and automated, analysts have ample opportunity to leave an
ideological (and potentially, hidden) impression on the process (Friedman and
Nissenbaum, 1997).
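To illustrate how a seemingly technical matching decision shapes the dataset, the following is a minimal sketch in Python. It assumes two small, invented record sources and a hypothetical normalization rule; the function names (`normalize`, `link_records`) and the sample records are illustrative assumptions, not drawn from any actual system. The same two sources yield a different combined dataset depending on whether names must match exactly or only after normalization.

```python
def normalize(name):
    """Lowercase and collapse whitespace (a hypothetical matching rule)."""
    return " ".join(name.lower().split())


def link_records(left, right, key, strict=True):
    """Pair records from two sources whose `key` values match.

    strict=True compares the raw values; strict=False compares them
    after applying `normalize`.
    """
    transform = (lambda value: value) if strict else normalize
    index = {}
    for record in right:
        index.setdefault(transform(record[key]), []).append(record)
    return [
        (l, r)
        for l in left
        for r in index.get(transform(l[key]), [])
    ]


# Invented sample records, for illustration only.
travel = [
    {"name": "Jane Doe", "flight": "XY123"},
    {"name": "john  smith", "flight": "ZZ9"},
]
finance = [
    {"name": "jane doe", "transfer": 5000},
    {"name": "John Smith", "transfer": 120},
]

strict_links = link_records(travel, finance, "name", strict=True)
loose_links = link_records(travel, finance, "name", strict=False)
print(len(strict_links), len(loose_links))
```

Under the exact rule no records are linked, while under the normalized rule both individuals appear in the combined dataset. The choice of rule is the analyst's, and it is made before any pattern is mined.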
In addition, applying data mining models calls for several subtle yet important
policy decisions which can impact the entire process. These decisions are rarely
made public. For instance, consider the setting of the acceptable level of false
negatives in the predictive process. A false negative is an instance in which the
sought event transpires but the data mining analysis fails to reveal it.
False negatives result from a broad and diverse mix of factors and are very
difficult to establish.
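The policy character of this setting can be seen in a minimal sketch, given here in Python. It assumes the model assigns each case a numeric risk score and that cases scoring at or above a chosen threshold are flagged; the scores, labels and the function name `false_negative_rate` are invented for illustration. It also assumes the true outcomes are known, which, as noted above, is rarely the case in practice.

```python
def false_negative_rate(scores, labels, threshold):
    """Share of true events (label 1) whose risk score falls below the threshold."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not positives:
        raise ValueError("no true events in the sample")
    missed = sum(1 for s in positives if s < threshold)
    return missed / len(positives)


# Hypothetical risk scores and ground-truth labels, for illustration only.
scores = [0.9, 0.7, 0.4, 0.2, 0.8, 0.3]
labels = [1, 1, 1, 0, 0, 1]

for threshold in (0.25, 0.5, 0.75):
    print(threshold, false_negative_rate(scores, labels, threshold))
```

On this invented sample, raising the threshold from 0.25 to 0.75 raises the share of missed events from 0.0 to 0.75. Where the threshold is set is therefore a choice about how many missed events are tolerable, not a purely technical parameter.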
Another, more subtle, policy decision focuses on interpretation. Thus far, we
have described data mining as a process which reveals mere correlations. Data
mining might point to individuals and events, indicating elevated risk, without
telling us why they were selected. However, the definition quoted above describes
data mining, among other things, as a process that is “ultimately understandable.”
The level of understanding the data mining process provides depends on whether
the process is interpretable or non-interpretable. Data mining can enable
non-interpretable processes. In such a case, the reasons for the decisions the algorithm
leads to are not explainable in human language. The software makes its decisions
based upon multiple variables. Here, the role of the analyst is minimized. The lack
of interpretability bears not only on the role of the analysts, but also on the
feedback available to individuals affected by the data mining process. It
would be difficult for the government to provide a detailed response when asked
why an individual received differentiated treatment. The government might be