5.4 Conclusions
As the complexity and scale of current scientific and engineering applications
grow, managing and transporting the large amounts of data they generate is
quickly becoming a significant bottleneck. The increasing application runtimes
and the high cost of high-performance computing resources make online data
extraction and analysis a key requirement in addition to traditional data I/O
and archiving. To be effective, online data extraction and transfer should im-
pose minimal additional synchronization requirements, should have minimal
impact on the computational performance, maintain overall QoS, and ensure
that no data is lost.
A key challenge that must be overcome is getting the large amounts of data
being generated by these applications off the compute nodes at runtime and
over to service nodes or another system for code coupling, online monitor-
ing, analysis, or archiving. To be effective, such an online data extraction and
transfer service must (1) have minimal impact on the execution of the sim-
ulations in terms of performance overhead or synchronization requirements,
(2) satisfy stringent application/user space, time, and QoS constraints, and
(3) ensure that no data is lost. On most expensive HPC resources, the large
numbers of compute nodes are typically serviced by a smaller number of service
nodes to which they can offload expensive I/O operations. As a result, the
I/O substrate should be able to asynchronously transfer data from compute
nodes to a service node with minimal delay and overhead on the simulation.
Technologies such as RDMA allow fast memory access into the address space
of an application without interrupting the computational process, and provide
a mechanism that can support these requirements.
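
The asynchronous hand-off described above can be illustrated with a minimal sketch. It is not RDMA and not the chapter's implementation: a background thread and a bounded queue stand in for a service node and its staging buffer, and the names StagingArea and run_simulation are hypothetical. The point is only that the compute loop hands each step's output off and continues, while a final drain ensures that nothing is lost.

```python
import queue
import threading


class StagingArea:
    """Hypothetical stand-in for a service node that drains compute-node output."""

    def __init__(self, capacity=4):
        # Bounded queue: plays the role of a fixed-size staging buffer.
        self._pending = queue.Queue(maxsize=capacity)
        self.received = []  # what the "service node" has stored so far
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def _drain(self):
        # Runs off the compute path; a None item is the shutdown signal.
        while True:
            item = self._pending.get()
            try:
                if item is None:
                    return
                self.received.append(item)
            finally:
                self._pending.task_done()

    def put(self, step, data):
        # Copy the buffer so the simulation can reuse it immediately.
        # If the queue is full this call blocks (back-pressure) instead of
        # dropping data.
        self._pending.put((step, list(data)))

    def close(self):
        # Wait until every queued item has been consumed, then stop the worker.
        self._pending.put(None)
        self._pending.join()
        self._worker.join()


def run_simulation(steps, staging):
    """Toy compute loop: update state, hand it off, keep computing."""
    state = [0.0] * 4
    for step in range(steps):
        state = [x + 1.0 for x in state]
        staging.put(step, state)
    staging.close()
    return state
```

In this sketch a full buffer stalls the producer rather than discarding output, which trades some computational overhead for the no-data-loss requirement; a real transport would have to balance the two under its own QoS constraints.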
In this chapter we described the ADIOS I/O system and its underlying
mechanisms, which represent a paradigm shift in how I/O in high-performance
scientific applications is formulated, specified, and executed. In this new
paradigm, the construction of the writes and reads within the application
code is decoupled from the specification of how that I/O should occur at
runtime. This allows the end user substantial additional flexibility in mak-
ing use of the latest in high-throughput and asynchronous I/O methods
without rewriting (or even relinking) their code. The underlying mecha-
nisms include low-level interfaces which enable lightweight data capture, asyn-
chronous data movement, and specialized adaptive transport services for MPP
and wide-area systems. Our experiences with a number of fusion and other
codes have demonstrated the effectiveness, efficiency, and flexibility of the
ADIOS approach and the accompanying technologies such as DataTaps, I/O
graphs, DART, and the autonomic data management, transport, and processing
services. These services use metadata that affect I/O operations and
access to parallel file systems. Other aspects of metadata are discussed in
Chapter 12.
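
The decoupling of write calls from the runtime choice of I/O method can be sketched as follows. This is an illustration under stated assumptions, not the ADIOS API: the classes NullTransport and BufferedTransport, the TRANSPORTS registry, and the functions open_group and write_restart are all hypothetical, and a plain dictionary stands in for whatever external specification selects the method at runtime.

```python
class NullTransport:
    """Hypothetical method that discards data but counts the write calls."""

    def __init__(self):
        self.count = 0

    def write(self, name, value):
        self.count += 1


class BufferedTransport:
    """Hypothetical method that keeps the written variables in memory."""

    def __init__(self):
        self.buffer = {}

    def write(self, name, value):
        self.buffer[name] = value


# Registry of available methods, keyed by the name used in the configuration.
TRANSPORTS = {"null": NullTransport, "buffered": BufferedTransport}


def open_group(group, config):
    """Pick the transport for an I/O group from the runtime configuration."""
    method = config[group]["method"]
    return TRANSPORTS[method]()


def write_restart(out, temperature, step):
    """Application code: says what to write, never how it is written."""
    out.write("temperature", temperature)
    out.write("step", step)


# Usage: the same application code runs against either method.
config = {"restart": {"method": "buffered"}}
out = open_group("restart", config)
write_restart(out, [1.0, 2.0], 7)
```

Changing the "method" entry in the configuration switches the transport without touching write_restart, which is the property the paragraph above attributes to the new paradigm.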