less, the data will be deleted. Purges may also occur when the file system's
free space falls below a given threshold. This forces users of the system
to transfer their data off of the scratch file system to some other longer-term
storage location. This could be another file system (typically larger, but slower,
and therefore less expensive), tape, or another institution entirely.
While this tends to be cost effective, moving data around is very inconvenient
for users. The time spent moving data is also overhead that does
not directly contribute to the solution of the problem.
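To make the purge mechanism concrete, the following is a minimal sketch in Python of an age- and free-space-based purge. The scratch path, the 14-day age limit, and the 10% free-space threshold are hypothetical values chosen for illustration, not ALCF policy.

```python
#!/usr/bin/env python3
"""Minimal sketch of a scratch purge policy (illustrative only)."""
import os
import shutil
import time

SCRATCH_ROOT = "/scratch"     # hypothetical mount point
MAX_AGE_DAYS = 14             # hypothetical purge age
MIN_FREE_FRACTION = 0.10      # hypothetical free-space threshold

def free_fraction(path):
    usage = shutil.disk_usage(path)
    return usage.free / usage.total

def purge(root, max_age_days):
    cutoff = time.time() - max_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                # Delete files whose last access time is older than the cutoff.
                if os.stat(full).st_atime < cutoff:
                    os.remove(full)
            except OSError:
                pass  # file vanished or permission denied; skip it

if __name__ == "__main__":
    # Age-based purge runs unconditionally; a more aggressive pass is
    # triggered when free space drops below the threshold.
    purge(SCRATCH_ROOT, MAX_AGE_DAYS)
    if free_fraction(SCRATCH_ROOT) < MIN_FREE_FRACTION:
        purge(SCRATCH_ROOT, MAX_AGE_DAYS // 2)
```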
When designing the I/O system at ALCF, the typical HPC I/O system
criteria were considered: integrity (the data read back is the data that was
written), bandwidth (how fast can the data be written to the storage system),
stability (how often do things break), and resiliency (how does the system
respond when things do break). ALCF also added the user experience to
their list of requirements. Their goal is to create better, more user-friendly
storage systems by avoiding the typical scratch design and not requiring users
to move data as much. As noted, each supercomputer (Intrepid and Mira)
has its own complete ecosystem of storage, network, analysis cluster, support
infrastructure, etc. In order to meet the goal of avoiding unnecessary data
movement, each supercomputer complex has a single storage system that is
mounted on all user-facing nodes. This results in peak bandwidths that tend
to be lower than other comparable sites, but the storage system is large enough
that users can generally keep all their data in a single place on disk until they
are finished with it. Then users can transfer the data to tape, or out of the
facility for long-term archival storage.
4.3 I/O Hardware
The ALCF essentially runs two independent facilities centered around the
two Blue Gene supercomputers. The systems are on two different networks,
each running independent instances of all the required services such as au-
thentication, domain name servers, etc.
At a high level, the I/O system design is the same on each of the ma-
chines. As is common on the largest supercomputers, the Blue Gene family
uses dedicated I/O nodes, each of which services the I/O needs of a subset of
the compute nodes. It is simply not practical to build a storage fabric with
tens of thousands of endpoints on it. Users' computational jobs make file
system calls on the compute nodes. These file system calls are intercepted and
forwarded by the operating system to the I/O nodes. The I/O nodes option-
ally do aggregation and optimization of the calls coming in and then replay
the system calls on behalf of the compute nodes. The configuration of each
system is covered in detail below.
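As a rough illustration of this forwarding and aggregation pattern, the sketch below stands in for an I/O node that queues forwarded write requests, coalesces contiguous ones, and replays them against the file system. It is an in-process Python illustration, not the Blue Gene I/O node software; the class name, the simple coalescing rule, and the file path are assumptions made for the example.

```python
"""Illustrative sketch of I/O forwarding with write aggregation."""
import os

class IONode:
    """Receives forwarded write requests from compute nodes, coalesces
    contiguous ones, and replays them on the file system."""

    def __init__(self):
        self.pending = []  # list of (path, offset, data)

    def forward_write(self, path, offset, data):
        # Merge with the previous request if it is contiguous in the
        # same file; otherwise queue a new request.
        if self.pending:
            p_path, p_off, p_data = self.pending[-1]
            if p_path == path and p_off + len(p_data) == offset:
                self.pending[-1] = (p_path, p_off, p_data + data)
                return
        self.pending.append((path, offset, data))

    def flush(self):
        # Replay the (possibly aggregated) calls on behalf of the
        # compute nodes.
        for path, offset, data in self.pending:
            mode = "r+b" if os.path.exists(path) else "wb"
            with open(path, mode) as f:
                f.seek(offset)
                f.write(data)
        self.pending.clear()

# A compute node would normally issue ordinary POSIX calls; here we call
# forward_write() directly to stand in for the intercepted system call.
ion = IONode()
ion.forward_write("/tmp/demo.dat", 0, b"hello ")
ion.forward_write("/tmp/demo.dat", 6, b"world\n")  # contiguous, so coalesced
ion.flush()
```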
 