periods, as samples are transported. A routinized check comparing samples to field
sheets occurs at each end of each trip: from field site to Baltimore labs, or from
Maryland to Millbrook.
There are other bits of the river that are shed along the way, never making it to the
main database. For instance, conductivity is another measure taken right at the field
site. This measure is recorded on the field sheets in the second to last column, but it
travels no further along the chain. Conductivity accompanies our scientists back to the
lab, but there it is forgotten, or rather, buried in a mound of archived field sheets: “It
is possible to go back to these data sheets if necessary, but they would have to dig.”
When we asked why these data points made it no further, the glib response was simply
that no entry existed for conductivity in the database. Meanwhile, well-worn columns
of the database were filled with qualitative observations about smells and random field
events.
This labeling ritual, the notations on samples, and the checks at each point of transport
are the cascades of rituals that tie field sites to samples to databases. We have
only scratched the surface of these events. Our scientists described how samples
are placed carefully in the car to prevent overturning. Bottles, whether filled or empty, are
transported with sealed lids to prevent cross-contamination. Transferring water from
one bottle to another, or to an instrument, is done in isolation from other
samples to ensure none are confused. We observed cascades of rituals, from the moment
of a sample's collection in the river to its placement in the lab refrigerator.
At each of these tiny transitions, data again threaten to become unruly masses. A
misinscribed sample number, a confusion of two bottles, or the spilling of a sample
during filtration can all threaten the chain that links a date and a field site to a sample
and its eventual transformation into data. A myriad of ritualized activities seek to solidify
this chain, but small mistakes and accidents still occur. At best, a mistake or accident is
caught and a data point is lost. While a single data point is a loss, in the grand scheme
of a longitudinal database, it is a fairly small one. Scientists who use these data expect
such things: anomalies and outliers that must be thrown out, missing data points that
can be interpolated or extrapolated. At worst, a mistake becomes systematic (as with
a misplaced gauge stick), so that entire sections of a data stream must be reconstructed
or thrown out altogether.
The metaphor of a chain is revealing: it helps us understand the heterogeneous work
of custodianship stretching from field site to lab and from lab to databases. Only at the
end of these mediations can we meaningfully speak of “raw data.” Nevertheless, the