To attain these goals, it is necessary to move structured rather than
unstructured data. As noted above in describing the ADIOS API, efficiency
in data movement is inextricably tied to knowledge about the type and
structure of the data being moved, because such knowledge makes it possible
to manipulate data during movement: routing it to appropriate sites,
reorganizing it for storage or display, filtering it, or otherwise
transforming it to suit current end-user needs. Next we describe the
efficient, asynchronous data capture and transport mechanisms that underlie
such functionality:
DataTaps: flexible mechanisms for extracting data from or injecting
data into HPC computations. Efficiency is gained by making it easy to
vary I/O overheads and costs, in terms of buffer usage and CPU cycles
spent on I/O, and by controlling I/O volumes and frequency. DataTaps
move data from compute nodes to DataTap servers residing on I/O
nodes.
Structured data: structure information about the data being captured,
transported, manipulated, and stored enables annotation or modification
both synchronously and asynchronously with data movement.
I/O graphs: explicitly represent an application's I/O tasks as
configurable overlay* topologies of the nodes and links used for moving
and operating on data, and enable systemwide I/O resource management.
I/O graphs start with the lightweight DataTaps on computational
nodes; traverse arbitrary additional task nodes on the petascale machine
(including compute and I/O nodes as desired); and “end” on storage,
analysis, or data visualization engines. Developers use I/O graphs to
flexibly and dynamically partition I/O tasks and concurrently execute
them across petascale machines and the ancillary engines supporting
their use.
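To make the first two mechanisms concrete, the following is a minimal sketch of a DataTap-like capture point, not the actual DataTap interface. The names Record, DataTap, max_buffered, and sink are hypothetical, and the policy of rejecting a write when the buffer is full is an assumption chosen only to show how buffer usage can bound I/O cost. Each record carries its own schema, so a downstream stage can interpret it without out-of-band knowledge.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A structured record: field names travel with the values."""
    schema: tuple   # field names, e.g. ("x", "y", "species")
    values: tuple   # one value per field, in schema order


class DataTap:
    """Hypothetical DataTap-like buffer (illustration only).

    Writes are non-blocking; a background thread drains the bounded
    buffer to `sink`, which stands in for a DataTap server.
    """

    def __init__(self, max_buffered, sink):
        self._queue = queue.Queue(maxsize=max_buffered)
        self._sink = sink
        self.rejected = 0  # writes refused because the buffer was full
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def write(self, record):
        """Enqueue without blocking the caller; False if the buffer is full."""
        try:
            self._queue.put_nowait(record)
            return True
        except queue.Full:
            self.rejected += 1  # assumed policy: caller decides what to do
            return False

    def _drain(self):
        """Forward buffered records to the sink until the sentinel arrives."""
        while True:
            record = self._queue.get()
            if record is None:
                break
            self._sink(record)

    def close(self):
        """Flush remaining records and stop the drain thread."""
        self._queue.put(None)  # blocking put, so the sentinel is never lost
        self._worker.join()
```

In this sketch, max_buffered is the knob that trades memory on the compute node against the chance that a write is refused; a real system could instead block, aggregate, or reduce output frequency.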
The simple I/O graph shown in Figure 5.1 spans compute and I/O
nodes. It first filters particles to include only interesting data,
say, those within some bounding boxes or of some plasma species. The
filtering I/O node then forwards the particle data to other I/O nodes,
which in turn forward particle information to in situ visualization clients
(which may be accessed remotely) and to storage services that store the
particle information in two different ways: one in which the particles are
stored by the bounding box they fall in, and one in which the
particles are stored by the timestep and compute node in which
the information was generated.
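The filter-then-forward flow described above can be sketched in a few lines. This is an illustrative model only: Particle, Box, filter_stage, and run_graph are hypothetical names, each stage is an ordinary function rather than a process on an I/O node, and the sketch assumes both criteria (bounding box and species) are applied together. The visualization client is omitted.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Particle:
    x: float
    y: float
    species: str
    timestep: int
    node: int      # compute node that generated the particle


@dataclass(frozen=True)
class Box:
    name: str
    xmin: float
    xmax: float
    ymin: float
    ymax: float

    def contains(self, p):
        """True if particle p lies inside this half-open box."""
        return self.xmin <= p.x < self.xmax and self.ymin <= p.y < self.ymax


def filter_stage(particles, boxes, species):
    """Filtering node: yield (box, particle) for particles of interest."""
    for p in particles:
        if p.species not in species:
            continue
        for box in boxes:
            if box.contains(p):
                yield box, p
                break  # assign each particle to the first matching box


def run_graph(particles, boxes, species):
    """Forward filtered particles to two stores with different layouts."""
    by_box = defaultdict(list)     # keyed by bounding-box name
    by_origin = defaultdict(list)  # keyed by (timestep, compute node)
    for box, p in filter_stage(particles, boxes, species):
        by_box[box.name].append(p)
        by_origin[(p.timestep, p.node)].append(p)
    return dict(by_box), dict(by_origin)
```

A short usage example, with made-up particle values:

```python
boxes = [Box("A", 0, 1, 0, 1), Box("B", 1, 2, 0, 1)]
particles = [
    Particle(0.5, 0.5, "ion", 0, 0),
    Particle(1.5, 0.5, "ion", 0, 1),
    Particle(0.5, 0.5, "electron", 0, 0),  # filtered out by species
    Particle(5.0, 5.0, "ion", 1, 0),       # filtered out by position
]
by_box, by_origin = run_graph(particles, boxes, {"ion"})
```

Because the filter runs once and both stores consume its output, the two storage layouts cost one pass over the data rather than two.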
* Overlay networks are virtual networks of nodes built on top of an underlying physical network.
In an I/O graph, data moves between nodes of the overlay via logical (virtual) links, whereas in
reality it may traverse one or more physical links between those nodes in the underlying physical
network.