Earth's climate. These climate simulations can generate several terabytes,
or even petabytes, of raw data. Such data-intensive applications present a
tremendous set of challenges for computer centers that strive to accommo-
date, maintain, transfer, and preserve data.
Large-scale computing centers that host such applications cannot afford
to frequently stall on data accesses and lose precious computing cycles. The
I/O performance of scientific applications such as climate models has a great
impact on the completion of scientific simulation and post-mortem data
visualization and analysis. However, many applications have complex data
characteristics that are not well supported by existing parallel I/O
libraries. For example, an application may generate a large number of small
variables: while the data is large in aggregate, each process holds only a
very small amount of data for each variable. One such application is GEOS-5
(the Goddard Earth Observing System Model), one of NASA's climate and weather
models. A GEOS-5 simulation at a coarse half-degree resolution generates only
3.12 GB of data, but consists of 185 2D variables and 80 3D variables at a
time. The number
of timesteps can be configured to be on the order of thousands. It is
challenging to provide good I/O speed for both writing and reading when large
numbers of small I/O requests are expected. Moreover, data post-processing
commonly examines data along the time dimension, e.g., observing the change
of temperature within one hour. However, the storage data layouts in common
use today do not support such an access pattern well.
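
To make the small-request problem concrete, the following back-of-the-envelope sketch counts the requests issued when every process writes each variable separately at each timestep, and the reads needed to follow one variable through time. The variable counts come from the GEOS-5 example above; the process count, the number of timesteps, and the function names are illustrative assumptions, not figures from the text.

```python
# Back-of-the-envelope count of I/O requests for a many-small-variables output.
# The variable counts (185 2D, 80 3D) come from the GEOS-5 example in the text;
# NUM_PROCESSES and NUM_TIMESTEPS are assumed values chosen for illustration.

VARS_2D = 185
VARS_3D = 80
NUM_PROCESSES = 1024   # assumption
NUM_TIMESTEPS = 1000   # assumption ("order of thousands")


def write_requests(num_vars: int, num_writers: int, num_timesteps: int) -> int:
    """Requests issued if each writer writes each variable once per timestep."""
    return num_vars * num_writers * num_timesteps


def time_series_reads(num_timesteps: int, steps_per_block: int) -> int:
    """Contiguous reads needed to fetch one variable over all timesteps when
    the file stores `steps_per_block` consecutive timesteps of it together."""
    return -(-num_timesteps // steps_per_block)  # ceiling division


num_vars = VARS_2D + VARS_3D
naive = write_requests(num_vars, NUM_PROCESSES, NUM_TIMESTEPS)

print(f"variables per timestep: {num_vars}")
print(f"write requests (one per process per variable): {naive:,}")
print("reads for one variable's time series, per-timestep layout:",
      time_series_reads(NUM_TIMESTEPS, 1))
print("same, with 64 timesteps stored together:",
      time_series_reads(NUM_TIMESTEPS, 64))
```

Under these assumed settings the request count runs into the hundreds of millions even though the total data volume is modest, which is why per-request overhead rather than bandwidth tends to dominate.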
In addressing critical climate issues, climatologists and meteorologists are
limited by the available spatial and temporal resolutions of the current climate
and weather models. Tian et al. [14] designed an ADIOS I/O method
specifically to enable higher resolution in output datasets. It leverages the
spatial and temporal relationships between variable data. Spatially, it
aggregates and merges data chunks of the same timestep so that fewer
processes write larger blocks. Temporally, the data of the same variable
across different timesteps are merged together with time as a new dimension,
which again reduces the number of I/O requests for both writing and reading.
This strategy has provided a speedup of two orders of magnitude compared to
simple writing (and reading) approaches.
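
The two ideas can be illustrated with a toy sketch. This is not the method of Tian et al. [14] or any ADIOS code: the function names (`aggregate_spatially`, `merge_over_time`), the one-dimensional chunks, and the grouping of consecutive ranks under one aggregator are all assumptions made for illustration.

```python
# Toy illustration of spatial aggregation and time merging.
# This is NOT the ADIOS implementation; function names, the 1D chunk layout,
# and the rank-to-aggregator grouping are assumptions made for illustration.
from typing import List

Chunk = List[float]


def aggregate_spatially(chunks_by_rank: List[Chunk], group_size: int) -> List[Chunk]:
    """Merge the chunks of `group_size` consecutive ranks into one larger
    block, so that one aggregator writes on behalf of the whole group."""
    blocks: List[Chunk] = []
    for start in range(0, len(chunks_by_rank), group_size):
        merged_block: Chunk = []
        for chunk in chunks_by_rank[start:start + group_size]:
            merged_block.extend(chunk)
        blocks.append(merged_block)
    return blocks


def merge_over_time(blocks_per_step: List[List[Chunk]]) -> List[List[Chunk]]:
    """Regroup [timestep][aggregator] blocks as [aggregator][timestep], i.e.
    add time as a leading dimension of each aggregator's block."""
    num_aggregators = len(blocks_per_step[0])
    return [[step[a] for step in blocks_per_step] for a in range(num_aggregators)]


# 8 ranks, each holding 2 values of one variable, over 3 timesteps.
NUM_RANKS, GROUP_SIZE, NUM_STEPS = 8, 4, 3
steps = []
for t in range(NUM_STEPS):
    chunks = [[float(100 * t + 2 * r), float(100 * t + 2 * r + 1)]
              for r in range(NUM_RANKS)]
    steps.append(aggregate_spatially(chunks, GROUP_SIZE))

merged = merge_over_time(steps)

naive_writes = NUM_RANKS * NUM_STEPS           # one write per rank per timestep
aggregated_writes = len(steps[0]) * NUM_STEPS  # one write per aggregator per timestep
time_merged_writes = len(merged)               # one write per aggregator in total
print(naive_writes, aggregated_writes, time_merged_writes)
```

In this toy setting the number of writes drops from 24 to 6 with spatial aggregation and to 2 once the timesteps are merged as well. The sketch buffers every timestep before writing; a real method would presumably bound how many timesteps are held in memory at once.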
17.3.3 Code Coupling
ADIOS DataSpaces [6] is a scalable data-staging substrate that supports
advanced coordination and interaction services for extreme-scale coupled sim-
ulation workflows. It provides the abstractions and mechanisms to support
flexible and dynamic inter-application coupling and interactions at runtime.
It also supports asynchronous data insertion and retrieval to/from a staging
area composed of a set of cores on application nodes and dedicated staging
nodes. For example, in the case of a simulation-visualization workflow, the
simulation can output data to the staging resources at runtime using the DataSpaces
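
The general pattern of asynchronous insertion into, and retrieval from, a staging area can be sketched with a toy in-memory store shared by two threads. This is only an illustration of the coupling pattern and does not use or reproduce the DataSpaces API: the `StagingArea` class, its `put` and `get` methods, and the `(name, version)` key are hypothetical.

```python
# Toy in-memory staging area illustrating asynchronous put/get coupling.
# This is NOT the DataSpaces API; the class and method names are hypothetical.
import threading
from typing import Dict, List, Tuple


class StagingArea:
    """Shared store keyed by (variable name, version)."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[str, int], List[float]] = {}
        self._cond = threading.Condition()

    def put(self, name: str, version: int, values: List[float]) -> None:
        """Insert a copy of the data, wake any waiting readers, and return."""
        with self._cond:
            self._data[(name, version)] = list(values)
            self._cond.notify_all()

    def get(self, name: str, version: int, timeout: float = 5.0) -> List[float]:
        """Block until the requested version is available, then return it."""
        with self._cond:
            ready = self._cond.wait_for(
                lambda: (name, version) in self._data, timeout)
            if not ready:
                raise TimeoutError(f"{name} v{version} not staged in {timeout}s")
            return self._data[(name, version)]


def simulation(staging: StagingArea, num_steps: int) -> None:
    """Producer: stage one small field per timestep, then carry on."""
    for step in range(num_steps):
        staging.put("temperature", step, [float(step)] * 4)


def visualization(staging: StagingArea, num_steps: int, out: List[float]) -> None:
    """Consumer: retrieve each timestep once it is staged and reduce it."""
    for step in range(num_steps):
        field = staging.get("temperature", step)
        out.append(sum(field) / len(field))


staging = StagingArea()
means: List[float] = []
consumer = threading.Thread(target=visualization, args=(staging, 3, means))
producer = threading.Thread(target=simulation, args=(staging, 3))
consumer.start()
producer.start()
producer.join()
consumer.join()
print(means)
```

The producer never waits for the consumer, and the consumer blocks only until the version it asks for has been staged, which is the decoupling that a staging area is meant to provide between the two applications.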