Hardware Reference
In-Depth Information
16.1 Motivation
HDF5 [13], short for \Hierarchical Data Format, version 5" is a data model,
library, and file format for storing and managing data. It is designed to meet
the needs of applications that push the limits of what can be addressed by tra-
ditional database systems, XML documents, or in-house data formats. Many
HDF5 adopters have very large datasets, very fast access requirements, or very
complex data relationships. Others turn to HDF5 because it allows them to
easily share data across a wide variety of computational platforms using appli-
cations written in different programming languages. Some use HDF5 to take
advantage of the many open-source and commercial tools that understand
HDF5.
Similar to XML documents, HDF5 files are self-describing and allow users
to specify complex data relationships and dependencies. In contrast to XML
documents, HDF5 files can contain binary data (in many representations) and
allow direct access to parts of the file without first parsing the entire contents.
HDF5, not surprisingly, also allows hierarchical data objects to be ex-
pressed in a very natural manner, in contrast to the tables of a relational
database. Whereas relational databases support storing records in tables,
HDF5 defines n-dimensional datasets where each element in the dataset may
itself be a complex object. Relational databases offer excellent support for
queries based on field matching, but are not well-suited for sequentially pro-
cessing all records in the database or for subsetting based on coordinate-style
lookup, prime use cases for HDF5's capabilities.
In-house data formats are often developed by individuals or teams to meet
specific needs of a project. While the initial time to develop and deploy such
a solution may be quite low, the results are often not portable, not extensible,
and not high performance. In many cases, the time devoted to extending
and maintaining custom data management code takes an increasingly large
percentage of the project's total development eort|in eect reducing the
time available for the primary objectives of the project. HDF5 offers a flexible
format and powerful API backed by over 25 years of development history.
Developers can leverage HDF5's capabilities and tailor them to their needs
by building a thin project- or domain-specific API on top of the HDF5 API,
creating the best of both worlds.
16.2 History and Background
In 1987, the Graphics Foundations Task Force (GFTF) at the National
Center for Supercomputing Applications (NCSA) at the University of Illinois
 
Search WWH ::




Custom Search