These data sources are combined with published data, including satellite
imagery, in simulations to predict oceanographic features, such as seasonal
variations in water levels [85]. With the current data analysis routines, it
is difficult to ascertain how the results of these simulations were produced
because the information is spread across log files, scripts, and notes [85].
To address this problem, the Monterey Bay Aquarium Research Institute
has developed its own in-house system, the Shore Side Data System (SSDS),
for tracking the provenance of its data products [86]. These data products
range from lists of deployed buoys to time series plots of buoy movement. Using
SSDS, scientists can access the underlying sensor data but, more importantly,
they can trace derived data products back to the metadata of the sensors,
including their physical location, instrument, and platform. A key part of the
system is the ability to automatically populate metadata fields. For example,
because the system knows that an instrument is located on a mooring platform,
and that the platform therefore determines the instrument's position, it can
traverse the provenance graph to fill in the position metadata for that
instrument [86]. It is interesting to note that the metadata produced by SSDS
is in the netCDF standard format, which was previously discussed in Chapter 2,
Section 2.4. SSDS is an example of a production provenance system, as it has
been used daily for the past four years for managing ocean observation data [86].
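The metadata-population step described above can be illustrated with a small sketch. This is not the SSDS implementation or its schema: the `Node` class, the `resolve` and `populate` helpers, the node names, and the coordinates are all hypothetical, chosen only to show how a missing field might be filled by walking a provenance graph upstream until some ancestor defines it.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node in a toy provenance graph (hypothetical, not the SSDS schema)."""
    name: str
    kind: str  # e.g. "data_product", "instrument", "platform"
    metadata: dict = field(default_factory=dict)
    # Edges point upstream: to whatever this node was derived from or mounted on.
    parents: list = field(default_factory=list)


def resolve(node, key):
    """Breadth-first search upstream for the nearest node that defines `key`."""
    seen = set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        if id(current) in seen:
            continue
        seen.add(id(current))
        if key in current.metadata:
            return current.metadata[key]
        queue.extend(current.parents)
    return None  # no ancestor defines the key


def populate(node, key):
    """Fill in `key` on `node` from its ancestors if it is not already set."""
    if key not in node.metadata:
        value = resolve(node, key)
        if value is not None:
            node.metadata[key] = value
    return node.metadata.get(key)


# Illustrative values only: a mooring with a known position, an instrument
# mounted on it, and a plot derived from that instrument's data.
mooring = Node("mooring-A", "platform", {"position": (36.75, -122.03)})
ctd = Node("ctd-1", "instrument", {"instrument_type": "CTD"}, parents=[mooring])
plot = Node("temperature-plot", "data_product", parents=[ctd])

populate(ctd, "position")  # the instrument inherits the mooring's position
```

The same traversal serves the other direction of use mentioned in the text: starting from a derived product such as `plot`, `resolve(plot, "instrument_type")` walks back to the sensor metadata it was produced from.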
The focus of SSDS is tracking the provenance of datasets back to the
instruments and the associated configuration metadata. A more complex example
of provenance in oceanography is given in Howe et al. [85]. In this work, the
authors present a system that combines sensor data products with simulations
to present 3D visualizations of fishery data. Specifically, for the Collaborative
Research on Oregon Ocean Salmon Project, they combined data about the
location and depth at which salmon were caught in the Northwest of the United
States with simulation data about ocean currents to generate visualizations of
the depth and distribution of fish along the continental shelf [85]. The
key use of provenance here is to enable the scientists to explore the parameter
space of a visualization without having to worry about tracking the changes
to their visualization pipeline. For example, to see a different perspective on
the fish, the scientist may have to reconfigure the model they are using. With
the VisTrails system [87] (described in Chapter 13, Section 13.5), they can
easily find the changes they made or go back to other visualization pipelines.
This functionality is critical when dealing with these complex oceanographic
applications that integrate a variety of simulation techniques and data
sources.
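One way to picture this kind of change tracking is a tree of pipeline versions in which each version stores only what changed relative to its parent. The sketch below is a simplified illustration of that idea, not the VisTrails API: the `VersionTree` class, its methods, and the parameter names are assumptions made for the example.

```python
class VersionTree:
    """Toy change-based version tree for pipeline parameters (hypothetical)."""

    def __init__(self):
        # version id -> (parent id, parameter changes made at this version)
        self._versions = {0: (None, {})}  # version 0 is the empty root
        self._next_id = 1

    def commit(self, parent, changes):
        """Record a new version that applies `changes` on top of `parent`."""
        if parent not in self._versions:
            raise KeyError(f"unknown parent version: {parent}")
        vid = self._next_id
        self._versions[vid] = (parent, dict(changes))
        self._next_id += 1
        return vid

    def history(self, vid):
        """Return [(version, changes), ...] ordered from the root to `vid`."""
        chain = []
        while vid is not None:
            parent, changes = self._versions[vid]
            chain.append((vid, changes))
            vid = parent
        return list(reversed(chain))

    def materialize(self, vid):
        """Rebuild the full parameter set of `vid` by replaying its history."""
        params = {}
        for _, changes in self.history(vid):
            params.update(changes)
        return params


# Illustrative parameters only: a base pipeline and two explorations of it.
tree = VersionTree()
base = tree.commit(0, {"model": "currents-v1", "depth_range_m": (0, 200)})
zoomed = tree.commit(base, {"depth_range_m": (0, 50)})
alt_model = tree.commit(base, {"model": "currents-v2"})

shallow_view = tree.materialize(zoomed)
```

Because earlier versions are never overwritten, a scientist who reconfigures the model in `alt_model` can still return to `base` or `zoomed`, and `history` lists exactly which changes led to any given visualization.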
12.6.4 End-to-End Provenance for Large-Scale Astronomy Applications
We have seen the need to use provenance to re-create data on demand for
satellite imagery, automatically populate metadata fields for oceanographic
data, and track the changes in pipelines for visualizing salmon catches.