Scientific Data Management Challenges in High-Performance Visual Data Analysis - Scientific Data Management


to scientific data management: they must support a plethora of input data formats, and therefore contain a number of data loader modules; they must create an internal data structure that is suitable for use by a potentially large collection of visualization, analysis, and rendering modules, all of which may run in parallel on shared- or distributed-memory machines. An open problem is one of data models and semantics, where meaning is assigned to arrays of data stored in data files.

With data of massive scale, it is often useful to perform a multiresolution analysis, working first with a smaller, coarser version of the data, then progressively refining the analysis as interesting features are revealed. We saw that a space-filling curve model has proven to be highly efficient for interactive analysis of massive data. However, simulations are unlikely to write their output directly in such a data model and layout, even though it works very well for multiresolution analysis. Multiresolution, quantitative feature detection and analysis methods were demonstrated on two different datasets from the field of turbulent mixing. This quantitative analysis approach is very useful for enabling scientific knowledge discovery by focusing on features rather than on the machinery for creating potentially incomprehensible images of large-scale scientific data.
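
To make the idea of a space-filling curve layout concrete, the following is a minimal sketch, assuming a Z-order (Morton) curve on a small two-dimensional grid. The text above does not say which curve or hierarchy is used, so the function names `morton_encode_2d` and `coarse_to_fine_levels` and the level scheme are illustrative assumptions rather than the method described in the chapter.

```python
def morton_encode_2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y into a single Z-order index."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # x occupies even bit positions
        code |= ((y >> i) & 1) << (2 * i + 1)   # y occupies odd bit positions
    return code


def coarse_to_fine_levels(size: int):
    """Yield (level, points) for a size x size grid, size a power of two.

    Level 0 holds the coarsest sample. Each later level adds only the
    points not already visited, ordered along the Z-order curve, so a
    reader can load levels one at a time and refine progressively.
    """
    seen = set()
    stride = size
    level = 0
    while stride >= 1:
        points = [
            (x, y)
            for y in range(0, size, stride)
            for x in range(0, size, stride)
            if (x, y) not in seen
        ]
        points.sort(key=lambda p: morton_encode_2d(*p))
        seen.update(points)
        yield level, points
        stride //= 2
        level += 1


if __name__ == "__main__":
    for level, points in coarse_to_fine_levels(4):
        print(level, points)
```

Reading only the first few levels gives a coarse version of the grid, and each further level refines it without revisiting earlier samples. A simulation that writes plain row-major arrays would need a reordering pass to produce such a layout.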

A significant barrier faced by many computational and experimental science projects is the complexity of using state-of-the-art technology from scientific data management. We discussed an approach for encapsulating complexity in the form of a high-level API for data storage and retrieval that lowers the barrier to entry for using such technology. This concept was also applied to index/query technology. We have applied these concepts to multiple application areas to produce results showing that visual data analysis, as a field, can benefit from a close collaboration with the field of scientific data management. The benefit is improved performance for many data-intensive operations, like data I/O and data subsetting, as well as the potential to conceive and create completely new, paradigm-changing approaches to solve the problem of scientific knowledge discovery for massive, complex datasets.
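
The notion of hiding storage and index/query details behind a high-level API can be sketched as follows. This is only an illustration under stated assumptions: the class `ParticleStore` and its methods are hypothetical, the data lives in memory rather than in a file, and a simple sorted index stands in for whatever index technology the text refers to.

```python
import bisect


class ParticleStore:
    """Hypothetical facade: callers name variables, not file offsets."""

    def __init__(self):
        self._columns = {}   # variable name -> list of values
        self._indexes = {}   # variable name -> sorted (value, row) pairs

    def write(self, name, values):
        """Store one variable and build a sorted index over it."""
        self._columns[name] = list(values)
        self._indexes[name] = sorted(
            (v, row) for row, v in enumerate(self._columns[name])
        )

    def read(self, name, rows=None):
        """Return a whole variable, or only the requested rows."""
        column = self._columns[name]
        return list(column) if rows is None else [column[r] for r in rows]

    def query(self, name, low, high):
        """Return row ids where low <= value <= high, using the index."""
        index = self._indexes[name]
        start = bisect.bisect_left(index, (low, -1))
        stop = bisect.bisect_right(index, (high, float("inf")))
        return sorted(row for _, row in index[start:stop])


if __name__ == "__main__":
    store = ParticleStore()
    store.write("energy", [5.0, 1.0, 9.0, 3.0, 7.0])
    store.write("x", [10, 11, 12, 13, 14])
    rows = store.query("energy", 3.0, 7.0)   # subset by value range
    print(rows, store.read("x", rows))
```

The caller asks for a variable by name and a value range, and never sees how the data or the index is laid out. That separation is what would let a different storage format or index be swapped in without changing analysis code.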

Acknowledgments

This work was supported by the Director, Office of Advanced Scientific Computing Research, Office of Science, of the U.S. Department of Energy, under Contract No. DE-AC02-05CH11231 through the Scientific Discovery through Advanced Computing (SciDAC) program's Visualization and Analytics Center for Enabling Technologies (VACET). This research used resources of the
