FIGURE 18.2 (See color insert): Aggregation groups formed within a set of
nodes by GLEAN (a) with aggregator nodes highlighted, and (b) aggregator
node configuration used by MPI I/O.
has a 6D torus. An important goal in GLEAN is to leverage such topologies
to move the data out of the machine as soon as possible, thereby enabling the
simulation to continue on with its computation.
MPI collective I/O uses MPI_Alltoallv, in which, depending on an aggregator's
rank, the aggregation traffic can cross the partition boundaries of the
aggregator group. This leads to global communication and network contention.
GLEAN, in contrast, restricts the aggregation traffic to the group boundary.
This also reduces the synchronization requirements to just the processes
involved in an aggregation group, typically 64 nodes for BG/P and 128 nodes
for BG/Q, instead of all the nodes as in the MPI collective I/O case. This
significantly improves performance at larger core counts by reducing
global communication. Figure 18.2(a) shows GLEAN's selection of 8 aggregator
groups formed for 64 nodes. Each group is depicted by a distinct color
(or shade of gray), and the aggregator node for the group is highlighted. For
comparison, Figure 18.2(b) shows the 8 aggregator nodes used in MPI-IO
collective operations. In this case, the first 8 nodes are selected as
aggregators and highlighted. Note that the aggregation traffic in MPI
collective I/O is global and not restricted to an aggregator group.
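
The contrast between the two layouts in Figure 18.2 can be illustrated with a
small model. The following is only a sketch under stated assumptions, not
GLEAN's implementation: contiguous blocks of ranks stand in for GLEAN's
topology-aware groups, the first rank of each block is assumed to be its
aggregator, and the MPI collective I/O case is modeled as every node sending
to every aggregator, which is the worst case of an all-to-all exchange. All
function names are hypothetical.

```python
# Hypothetical model of the two aggregator layouts in Figure 18.2.
# Assumption: contiguous rank blocks approximate GLEAN's groups; the real
# grouping follows the machine topology and is not reproduced here.

def group_of(rank, group_size):
    """Return the index of the group that a node rank belongs to."""
    return rank // group_size

def glean_style_traffic(num_nodes, group_size):
    """Each node sends only to the aggregator of its own group.
    The first rank in each block is assumed to be the aggregator."""
    pairs = []
    for sender in range(num_nodes):
        aggregator = group_of(sender, group_size) * group_size
        pairs.append((sender, aggregator))
    return pairs

def all_to_all_traffic(num_nodes, num_aggregators):
    """First N ranks are aggregators and every node may send to each of
    them (worst case of an all-to-all exchange)."""
    return [(sender, agg)
            for sender in range(num_nodes)
            for agg in range(num_aggregators)]

def count_crossing(pairs, group_size):
    """Count sender/aggregator pairs that span a group boundary."""
    return sum(1 for sender, agg in pairs
               if group_of(sender, group_size) != group_of(agg, group_size))

NUM_NODES, GROUP_SIZE, NUM_AGGREGATORS = 64, 8, 8

glean_pairs = glean_style_traffic(NUM_NODES, GROUP_SIZE)
mpiio_pairs = all_to_all_traffic(NUM_NODES, NUM_AGGREGATORS)

glean_aggregators = sorted({agg for _, agg in glean_pairs})
mpiio_aggregators = sorted({agg for _, agg in mpiio_pairs})

glean_crossing = count_crossing(glean_pairs, GROUP_SIZE)
mpiio_crossing = count_crossing(mpiio_pairs, GROUP_SIZE)

print("grouped aggregators:", glean_aggregators, "crossing:", glean_crossing)
print("first-N aggregators:", mpiio_aggregators, "crossing:", mpiio_crossing)
```

In this model the grouped layout produces no messages that leave a group, and
a node only needs to synchronize with the other members of its group, whereas
in the first-N layout most sender/aggregator pairs span a group boundary.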
18.2.2 Leveraging Application Data Semantics
A key goal in designing GLEAN is to make application data semantics a
top priority. This enables GLEAN developers to apply various analytics to
the simulation data at runtime to reduce the data volume written to storage,
transform data on the fly to meet the needs of analysis, and enable various
I/O optimizations leveraging the application's data models. This effort has
worked closely with FLASH (see Section 18.3.2) [3], an astrophysics
application, to capture FLASH's adaptive mesh refinement (AMR) data model,
 