(Asay 2013). A Microsoft researcher worries about the uncritical accep-
tance of big-data analysis out of a widespread “big data fundamentalism”
(Hardy 2013i). One source of the fundamentalism is the belief that once
the easy work of gathering data is completed, the data will speak for itself,
yielding profitable gold nuggets of business information. But this is far
from the case. Analysis is the hard part and it is growing more challeng-
ing as the amount of collectible data expands. It is no wonder that some
experts worry that businesses are giving up on big data, leading one to
conclude that a “dirty little secret” of the industry is that “nobody wants
to use the data” (Elowitz 2013). Before examining what might appropriately
be called the big-data sublime, it is best to briefly examine what
the fuss is about.
Although in application big-data analysis can be a very challenging
exercise, its fundamentals are much less complicated than one might
expect. Analysts take sets of quantitative data and run correlations to find
relationships that yield insights, perhaps anticipated, perhaps not, and they
use these findings to make predictions. Let's consider the four important
elements in this description. First, the data under analysis are invariably
quantitative in that operations are applied to numerical values of objects,
events, outcomes, ideas, opinions, etc. This does not mean that big data
avoids qualitative information, but rather that analysts represent subjective
states with quantities—for example, by assigning numerical values to likes
and dislikes or to feelings of satisfaction or dissatisfaction.
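
The idea of representing subjective states with quantities and then running a correlation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five-point coding scheme, the survey rows, and the helper name pearson are all invented for the example and do not come from any study cited here.

```python
from math import sqrt

# Hypothetical coding scheme: ordinal satisfaction labels mapped to numbers.
SATISFACTION_SCALE = {
    "very dissatisfied": 1,
    "dissatisfied": 2,
    "neutral": 3,
    "satisfied": 4,
    "very satisfied": 5,
}


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented survey rows: (respondent age, stated satisfaction).
responses = [
    (22, "very satisfied"),
    (31, "satisfied"),
    (45, "neutral"),
    (58, "dissatisfied"),
    (67, "very dissatisfied"),
]

ages = [age for age, _ in responses]
# Qualitative answers become numbers only through the chosen scale.
scores = [SATISFACTION_SCALE[label] for _, label in responses]

r = pearson(ages, scores)
print(f"r = {r:.2f}")  # sign gives the direction, magnitude gives the strength
```

The sign of r reports the direction of the relationship and its absolute value the strength. The coefficient depends on the numeric coding chosen for the labels, and nothing in it indicates why the two variables move together.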
Second, big data develops generalizations based on correlations among
variables. According to two big-data specialists, this means internalizing “a
growing respect for correlations rather than a continuing quest for elusive
causality” (Mayer-Schönberger and Cukier 2013, 19). Such analysis might
lead to the conclusion that a voter's age is closely related to support for the
president. Specifically, as age increases, support decreases. Correlational
analysis can measure the direction of a relationship, positive or negative,
and the strength of that relationship. But it cannot say anything, by itself,
about causality or even about whether a relationship is genuine or spurious.
One cannot, from the data itself, determine whether two variables that are
positively related are also causally related—their relationship may be caused
by another, yet unrecognized, variable or, worse, their relationship may be
a figment of the data and the variables actually have nothing to do with
one another. Even correlations achieved at a high level of significance—for