One noticeable limitation of netCDF is its 2 GB file size limit, which
stems from the 4-byte integers used in the CDF file format. PnetCDF lifts
this limitation by adopting the CDF version 2 format. However, even though
the file size can grow beyond 2 GB, CDF version 2 still limits any single
array to 2 GB. The next generation of the CDF format is under development
and will remove this limitation.
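The arithmetic behind the 2 GB ceiling can be checked with a minimal
sketch. It assumes the offsets are stored as signed, big-endian 4-byte
integers; the helper names are hypothetical and are not part of netCDF or
PnetCDF.

```python
import struct

# Largest value a signed 4-byte integer can hold: 2**31 - 1 bytes,
# which is one byte short of 2 GiB.
INT32_MAX = 2**31 - 1


def fits_in_4_byte_offset(offset: int) -> bool:
    """Return True if `offset` can be packed as a signed 4-byte integer."""
    try:
        struct.pack(">i", offset)  # ">i" = big-endian signed 32-bit
        return True
    except struct.error:
        return False


def fits_in_8_byte_offset(offset: int) -> bool:
    """Return True if `offset` can be packed as a signed 8-byte integer."""
    try:
        struct.pack(">q", offset)  # ">q" = big-endian signed 64-bit
        return True
    except struct.error:
        return False


two_gib = 2 * 1024**3
print(fits_in_4_byte_offset(two_gib - 1))  # True: last addressable byte
print(fits_in_4_byte_offset(two_gib))      # False: overflows 4 bytes
print(fits_in_8_byte_offset(two_gib))      # True: 8 bytes is enough
```

Widening the offset field removes the ceiling on the file as a whole, but
any size field that stays at 4 bytes keeps its own 2 GB limit, which is
consistent with the per-array restriction described above.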
The netCDF API is divided into five categories: dataset functions (create,
open, or close a file, switch between define and data mode, and synchronize
changes to the file system); define mode functions (define array dimensions
and variables); attribute functions (add, change, and read attributes);
inquiry functions (return metadata); and data access functions (read or
write data using one of five access methods: single element, whole array,
subarray, subsampled array, and mapped strided subarray).
Parallel netCDF retains the file format of netCDF version 3, and its
implementation is built on top of MPI-IO, so users benefit from existing
I/O optimizations in the underlying MPI-IO library, such as the data
sieving and two-phase I/O strategies in ROMIO 21 23, 41 and data shipping in
IBM's MPI-IO library. 24, 25 To integrate seamlessly with the MPI-IO
functions, the PnetCDF APIs borrow a few MPI features: communicators, info
objects, and derived datatypes. An MPI communicator argument defines the
set of processes that participate in I/O between the file's open and close
calls. MPI info objects let users pass access hints for further
optimization. The PnetCDF define mode, attribute, and inquiry functions are
collective in order to guarantee data consistency among the participating
processes. There are two sets of parallel data access APIs. The high-level
API closely follows the original netCDF data access functions: the
read/write buffers must be contiguous in memory, but file access can be
noncontiguous. The flexible API provides a more MPI-like style of access
and permits noncontiguous I/O buffers through the use of MPI datatypes.
As in MPI-IO, PnetCDF data access functions are split into collective and
noncollective modes. netCDF users with an MPI background will find PnetCDF
easy to adopt because of the many features inherited from MPI.
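To make the access methods concrete, the following sketch models which
array elements a request touches. It is a pure-Python illustration and does
not call netCDF or PnetCDF. The function name flat_offsets and the
(start, count, stride) argument layout are assumptions chosen for this
example, and the array is assumed to be stored in row-major order. The
mapped strided method is not modeled.

```python
from itertools import product


def flat_offsets(shape, start, count, stride=None):
    """Row-major element offsets touched by a (start, count, stride) request.

    The returned list is in the order the elements would fill a
    contiguous memory buffer, even when the offsets themselves are
    noncontiguous in the stored array.
    """
    ndims = len(shape)
    if stride is None:
        stride = [1] * ndims
    # Row-major multipliers: the last dimension varies fastest.
    mult = [1] * ndims
    for d in range(ndims - 2, -1, -1):
        mult[d] = mult[d + 1] * shape[d + 1]
    offsets = []
    for idx in product(*(range(c) for c in count)):
        coord = [s + i * st for s, i, st in zip(start, idx, stride)]
        if any(c >= n for c, n in zip(coord, shape)):
            raise IndexError("request exceeds array bounds")
        offsets.append(sum(c * m for c, m in zip(coord, mult)))
    return offsets


shape = (4, 6)  # a 4 x 6 array, 24 elements in total

single = flat_offsets(shape, (1, 2), (1, 1))              # single element
whole = flat_offsets(shape, (0, 0), shape)                # whole array
sub = flat_offsets(shape, (1, 2), (2, 3))                 # subarray
strided = flat_offsets(shape, (0, 0), (2, 3), (2, 2))     # subsampled

print(single)   # [8]
print(sub)      # [8, 9, 10, 14, 15, 16]
print(strided)  # [0, 2, 4, 12, 14, 16]
```

The subarray and subsampled requests each fill a contiguous six-element
buffer from offsets that are not contiguous in the stored array, which is
the situation the high-level API handles. In the netCDF C interface these
request shapes correspond to the var1, var, vara, and vars function
families, with varm covering the mapped case.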
2.4.4.2 HDF5
Hierarchical Data Format version 5 (HDF5), 14 developed at the National
Center for Supercomputing Applications (NCSA), is a widely used high-level
I/O library that serves the same purposes as PnetCDF. HDF5 is a major
revision of HDF version 4. 26 Similar to netCDF, HDF4 allows annotated
multidimensional arrays and provides other features such as data
compression and unstructured grid representation. HDF4 does not support
parallel I/O, and file sizes are limited to 2 GB. HDF5 was designed to
address these limitations.
As a major departure from HDF4, HDF5's file format and APIs are completely
redesigned. An array stored in an HDF5 file is divided into header