1. The underlying I/O infrastructure only supports a partitioning scheme fixed by the simulation when the file(s) were created. Most of the time, this scenario corresponds to having one atomic chunk of data for each processor, atomic in the sense that partial reads of the chunk are not possible. Examples of I/O libraries of this type are Silo [6] and Exodus [7]. Other examples include file-per-processor output, which may be a good way to achieve I/O performance for the simulation's data-write phase, but which has undesirable consequences for the processing tools that read the data. Those tools are forced to reconcile the simulation's degree of parallelism with their own. For example, the simulation may decompose a three-dimensional space into 1,000 pieces, but the processing tool may be running with only five processors. In this case, the processing tool must find a way to partition the pieces across its processors, either by combining all of the pieces assigned to a given processor into one large piece or by respecting the piece layout and supporting multiple pieces per processor.
2. The underlying I/O infrastructure supports re-partitioning during read. Most of the time, this scenario corresponds to having all of the data in one large file, with the I/O infrastructure supporting operations like hyperslab reads, collective I/O, and so forth; a read of this style is sketched after this list. Examples of formats that can repartition data in this manner are ViSUS [8], SAF [9], and HDF5 [10].
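To make the second scenario concrete, here is a minimal sketch of a parallel hyperslab read, assuming a single HDF5 file holding a 3D dataset and an h5py installation built against parallel HDF5 (the mpio driver). The file path and dataset name are hypothetical placeholders, and the slab decomposition shown is just one simple choice.

    # Minimal sketch: each rank reads only its own slab of a shared file.
    from mpi4py import MPI
    import h5py

    comm = MPI.COMM_WORLD
    rank, nprocs = comm.Get_rank(), comm.Get_size()

    # All ranks open the same file collectively (hypothetical path/name).
    with h5py.File("simulation.h5", "r", driver="mpio", comm=comm) as f:
        dset = f["density"]            # e.g., shape (nz, ny, nx)
        nz = dset.shape[0]

        # Partition the slowest-varying axis into roughly equal slabs.
        counts = [nz // nprocs + (1 if r < nz % nprocs else 0)
                  for r in range(nprocs)]
        start = sum(counts[:rank])
        stop = start + counts[rank]

        # Hyperslab read: only this rank's contiguous slab is pulled in.
        local_slab = dset[start:stop, :, :]

    print(f"rank {rank}: read slab of shape {local_slab.shape}")

The key property is that the partitioning is chosen by the reader at read time, independent of how many processors wrote the data.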
These two scenarios are well supported by the major parallel production visualization tools, although the tools handle parallel partitioning differently in each. In the first case (imposed partitioning), each subset of the partition normally consists of domains, where each domain consists of the portion operated on by a single processor; the visualization tool distributes these domains across its own processors (a simple assignment scheme is sketched below). In the second case (adaptive partitioning), the visualization tool forms its own partition of the dataset by having each processor read in a unique piece. In both cases, it is important that each processor reads an approximately equal amount of data, since the amount of data correlates strongly with the work to be performed in subsequent stages. From a scientific data management (SDM) perspective, the summary is that both ways of writing data are acceptable.
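For the imposed-partitioning case, a minimal sketch of distributing domains across processors follows; the contiguous-block assignment and the 1,000-domain, five-processor example are illustrative choices, not how any particular tool does it.

    # Minimal sketch: contiguous block assignment of domains to ranks.
    def assign_domains(num_domains: int, rank: int, nprocs: int) -> list[int]:
        """Give each rank roughly num_domains / nprocs domains."""
        base, extra = divmod(num_domains, nprocs)
        start = rank * base + min(rank, extra)
        count = base + (1 if rank < extra else 0)
        return list(range(start, start + count))

    # Example from the text: 1,000 simulation pieces over 5 processors.
    for rank in range(5):
        mine = assign_domains(1000, rank, 5)
        print(f"rank {rank}: domains {mine[0]}..{mine[-1]} ({len(mine)} total)")

Each rank here receives 200 domains; because assignment is by count rather than by data size, real tools often weight the assignment by per-domain data volume to keep the load balanced.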
9.2.1.2 Processing
The modern parallel visualization tools all use a data flow network processing design [11-13]. Data flow networks have base types of data objects and components (sometimes called process objects). The components can be filters, sources, or sinks. Filters have an input and an output, both of which are data objects. Sources have only data object outputs, while sinks have only data object inputs. A pipeline is an ordered collection of components. Each pipeline has a source (typically a file reader) followed by one or more filters.
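The structure described above can be summarized in a minimal sketch; the class and function names here are hypothetical, and production tools implement far richer versions of this design.

    # Minimal sketch of a data flow network: sources, filters, sinks.
    class Source:
        """Produces a data object; typically a file reader."""
        def __init__(self, produce):
            self.produce = produce
        def execute(self):
            return self.produce()

    class Filter:
        """Consumes one data object and produces another."""
        def __init__(self, transform):
            self.transform = transform
        def execute(self, data):
            return self.transform(data)

    class Sink:
        """Consumes a data object; e.g., a renderer or writer."""
        def __init__(self, consume):
            self.consume = consume
        def execute(self, data):
            self.consume(data)

    def run_pipeline(source, filters, sink):
        """A pipeline: a source, an ordered list of filters, then a sink."""
        data = source.execute()
        for f in filters:
            data = f.execute(data)
        sink.execute(data)

    # Toy usage: read numbers, keep positives, double them, print the result.
    run_pipeline(
        Source(lambda: [-2, 3, 5, -1]),
        [Filter(lambda d: [x for x in d if x > 0]),
         Filter(lambda d: [2 * x for x in d])],
        Sink(print),
    )

The value of this design is composability: any filter can be chained after any component whose output type it accepts, so new processing steps slot into existing pipelines without changing the other components.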