Big Data and Cloud Culture - To the Cloud: Big Data in a Turbulent World

144). This comment takes us to correlation, the key technique for drawing

quantitative conclusions through big-data analysis, whether it is the

relationship of a ticket price to an actor's tears or between search terms

and the spread of flu.

As a sociologist, I am very familiar with both the magic and the danger

of the correlation. As a graduate student in the 1970s I can recall turning

in punch cards and receiving printouts that appeared magical because they

provided me with a series of correlations and confidence levels (measures

of statistical significance) that, even armed with my statistics textbook,

once took hours to complete. This gave me the first small taste of what

a mainframe computer could do, but it was still within the realm of my

own computational powers. More of a leap came in the 1980s when, with

another colleague, I launched my own major research project based on a

national survey of telephone workers in Canada (Mosco and Zureik 1987).

For this, the variables multiplied exponentially and so were far beyond

manual calculations. But there they were, hundreds of correlations that

brought together demographic data on the workforce, everything from

age to job category, with attitudes about the work, workmates, surveillance,

and the technology that was taking over more and more of the labor

process. This appeared to be even more magical because computers were

now doing something that I could not even conceivably accomplish on

my own. While not exactly the stuff of today's big-data studies, because

we relied on a national sample rather than a complete population, it gave

me the first feeling of what it was like to review a printout whose numbers

appeared to speak to me. But it did not take long, especially because the

senior member of our team was an experienced hand, to understand that

much of what I was looking at was of our own construction. We set up

and defined the variables, creating them out of our own theoretical vision

that established what mattered most in our view—the impact of electronic

surveillance on job satisfaction. As the popular (and very successful) data

analyst Nate Silver explained, “The numbers have no way of speaking

for themselves. We speak for them. We imbue them with meaning.” Any

other view is “badly mistaken” (Asay 2013). That became abundantly

clear when I realized that most of what was spoken, whoever was doing

the talking, was gibberish or what Silver and others call noise (Silver

2012). That was primarily because most of the correlations we found,

however strong, were spurious or irrelevant; that is, the relationship found
