Scientific Data Management Challenges in High-Performance Visual Data Analysis - Scientific Data Management


To summarize this section, data representation and file format design are
critical issues for visualization tools. First, the visualization tool needs to
be aware of most of the data that the simulation code itself is aware of, simply
because much of that information is either visualized directly or needed for
proper visualization. Second, additional metadata can enable optimizations and
greatly improve the performance of a visualization tool. Third, data layout
issues, such as the way data can be partitioned for parallelization, are very
important and can have a profound impact on end-to-end performance and
usability.

9.3 Multiresolution Data Layout for Large-Scale Data Analysis

In recent years, computational scientists with access to powerful
supercomputers have successfully simulated fundamental physical processes with
the goal of shedding new light on our understanding of nature. Such simulations
often produce massive amounts of data: grids of 1024^3 to 4096^3 points, with
multiple timesteps and dozens of variables per grid point, are not uncommon.
This data must be visualized and analyzed to verify and validate the underlying
model, to understand the phenomenon in detail, and to develop new insights into
fundamental physics. Both data visualization and data analysis are vibrant
research areas, and much effort is being spent on developing advanced, new
techniques to process the massive amounts of data produced by scientists. In
this section, we describe a multiresolution data layout, which provides quick
access to data at varying levels of resolution, from coarse to fine.
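To make the idea of coarse-to-fine access concrete, the following is a minimal sketch of one common formulation of a hierarchical Z-order on a small 2D grid. It is an illustration only and is not taken from the text: the function names (`morton2d`, `hz_index`, `hz_order_coords`), the restriction to a square power-of-two 2D grid, and the choice to place x in the even bit positions are all assumptions, and the scheme described in this section may differ in detail.

```python
def morton2d(x, y, bits):
    """Interleave the low `bits` bits of x and y into a Z-order index.

    Assumption for this sketch: x occupies the even bit positions, y the odd ones.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def hz_index(z, total_bits):
    """Map a Z-order index to a hierarchical index that stores coarse levels first."""
    if z == 0:
        return 0  # the single sample at the coarsest level
    trailing_zeros = (z & -z).bit_length() - 1
    level = total_bits - trailing_zeros
    # Samples of a given level occupy the index range [2**(level-1), 2**level).
    return (1 << (level - 1)) + (z >> (trailing_zeros + 1))


def hz_order_coords(bits):
    """Return the (x, y) coordinates of a 2**bits by 2**bits grid, coarse first."""
    side = 1 << bits
    coords = [(x, y) for y in range(side) for x in range(side)]
    return sorted(
        coords,
        key=lambda c: hz_index(morton2d(c[0], c[1], bits), 2 * bits),
    )


if __name__ == "__main__":
    order = hz_order_coords(2)  # a 4 x 4 grid
    print(order[:4])  # [(0, 0), (0, 2), (2, 0), (2, 2)]
```

In this sketch, reading the first 2^l entries of the reordered data yields a regularly subsampled version of the grid (for even l, a square grid with a uniform stride), and reading further entries only adds samples, so a viewer can refine progressively without rereading what it already has.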

9.3.1 Background

To provide context, we highlight these two components in the typical
visualization and analysis pipeline shown in Figure 9.3. We assume that raw data
from simulations is available as real-valued, regular samples of space-time.
Because these datasets are so large, the samples cannot all be loaded into main
memory at once, so it is not feasible to use standard implementations of
visualization and analysis algorithms on them.
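A rough size estimate shows why such data does not fit in main memory. The sketch below uses the grid sizes quoted earlier in this section; the 4-byte (single-precision) sample size and the figure of 24 variables, standing in for "dozens", are assumptions made only for illustration.

```python
GIB = 2**30
TIB = 2**40


def dataset_bytes(n, variables=1, timesteps=1, bytes_per_sample=4):
    """Bytes needed for an n x n x n grid.

    bytes_per_sample=4 assumes single-precision floats (an assumption,
    not a figure from the text).
    """
    return n**3 * variables * timesteps * bytes_per_sample


if __name__ == "__main__":
    for n in (1024, 4096):
        one_var = dataset_bytes(n) / GIB
        many_vars = dataset_bytes(n, variables=24) / GIB  # 24 is hypothetical
        print(f"{n}^3: {one_var:.0f} GiB per variable, "
              f"{many_vars:.0f} GiB per timestep with 24 variables")
```

Under these assumptions a single variable on a 4096^3 grid occupies 256 GiB, and one timestep with 24 variables occupies 6 TiB, before multiplying by the number of timesteps.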

Reordering this raw data into a suitable multiresolution data layout can
improve the efficiency of both visualization and analysis. Multiresolution
layouts enable interactive visualization by allowing the user to first load the
data at a coarse level, then progressively refine by adding more samples to
obtain a more detailed view. Classical schemes, for example, those based on
bricking or chunking, do not readily support the type of data access required
for progressive or multiresolution techniques. In the following, we describe
our hierarchical Z-order data layout scheme. It builds on the coherent layout
