a need to write data into and read data from file systems efficiently. Focusing
on files rather than on database systems simplified the task. Today, there are
several popular parallel file systems that are extensively used with supercom-
puters and large clusters, including Lustre, PVFS, and GPFS, described in
Chapter 2. Furthermore, the libraries for reading and writing the specialized
file formats are being adapted to take advantage of these file systems. The
third, and perhaps the most important, reason is that the logical data model
presented by existing database systems, especially the relational data model,
is not appropriate for most scientific data. Much scientific data is
multidimensional, such as space-time data representing simulations of natural
phenomena, or array data representing multivariate data. Furthermore, some
data uses grids that are not regular, such as geodesic grids or toroidal meshes,
and these have different data models.
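
To make the mismatch concrete, the following minimal sketch stores the same
small space-time field in two ways: as relational rows that carry their
coordinates explicitly, and as a dense array in which the coordinates are
implied by position. The grid size, the field() function, and all other names
are hypothetical and chosen only for illustration; they are not taken from any
system discussed in this chapter.

```python
# Hypothetical illustration: one 2 x 3 x 4 (time, x, y) field stored two ways.
from itertools import product

SHAPE = (2, 3, 4)  # (time steps, x cells, y cells); sizes chosen arbitrarily


def field(t, x, y):
    """Made-up simulation value for cell (t, x, y)."""
    return 100 * t + 10 * x + y


# Relational view: one row per cell, coordinates stored explicitly as columns.
rows = [(t, x, y, field(t, x, y)) for t, x, y in product(*map(range, SHAPE))]

# Array view: values only, in row-major order; coordinates are implied by position.
dense = [field(t, x, y) for t, x, y in product(*map(range, SHAPE))]


def offset(t, x, y):
    """Row-major offset of cell (t, x, y) in the dense layout."""
    _, nx, ny = SHAPE
    return (t * nx + x) * ny + y


def lookup_relational(t, x, y):
    """Find a cell by scanning rows (a real DBMS would use an index instead)."""
    for rt, rx, ry, value in rows:
        if (rt, rx, ry) == (t, x, y):
            return value
    raise KeyError((t, x, y))


def lookup_array(t, x, y):
    """Find a cell by address arithmetic alone."""
    return dense[offset(t, x, y)]
```

In the relational layout every value drags three coordinate columns with it,
and finding a neighboring cell requires a search or an index. In the array
layout a neighbor is a fixed offset away, which is one reason array-oriented
formats and data models suit this kind of data.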
Yet the concept of physical data independence is very attractive for
scientific data as well. A scientist would prefer to be concerned only
with the “abstract data model”, without dealing with files, datasets, file
formats, and the file systems involved. As data volumes grow, dealing
with I/O bottlenecks and matching application code to the data formats
and file systems in use impose a growing overhead on scientists and take
away from their productivity in doing science. Recent activities, such as
SciDB, described in Chapter 7, have taken the approach that the data model
of the system should represent multidimensional structures and other
structures that match the scientific domains. The goal is to have all the data stored in
such scientific database systems, relieving the scientist of the burden of
dealing with files, file formats, and various physical organization
considerations. Instead, the performance requirements will be specified,
and the database system will accordingly choose the most appropriate storage
structures and indexes for accessing the data, and the underlying file
structures to hold the data. Assuming that such scientific database management
systems will become operational over time and will mature, it is still unclear
whether scientists will abandon their practice of storing data in specialized
file formats.
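
The idea of physical data independence described above can be sketched as
follows. This is only an illustration under stated assumptions: the two storage
classes, the open_array() function, and its rule for choosing a layout are
hypothetical and do not describe how SciDB or any other system works. The point
is that the application function column_sum() refers to logical coordinates
only, so the layout can change without the application code changing.

```python
# Hypothetical sketch of physical data independence for a 2-D array.


class RowMajorStore:
    """Stores cells in one flat list, row after row."""

    def __init__(self, nrows, ncols):
        self.ncols = ncols
        self.cells = [0.0] * (nrows * ncols)

    def _pos(self, i, j):
        return i * self.ncols + j

    def get(self, i, j):
        return self.cells[self._pos(i, j)]

    def put(self, i, j, value):
        self.cells[self._pos(i, j)] = value


class ChunkedStore:
    """Stores cells in square tiles, which favors access to small subregions."""

    def __init__(self, nrows, ncols, tile=2):
        self.shape = (nrows, ncols)
        self.tile = tile
        self.tiles = {}  # (tile row, tile col) -> flat list of tile * tile cells

    def _locate(self, i, j):
        key = (i // self.tile, j // self.tile)
        pos = (i % self.tile) * self.tile + (j % self.tile)
        return key, pos

    def get(self, i, j):
        key, pos = self._locate(i, j)
        tile = self.tiles.get(key)
        return 0.0 if tile is None else tile[pos]

    def put(self, i, j, value):
        key, pos = self._locate(i, j)
        self.tiles.setdefault(key, [0.0] * self.tile ** 2)[pos] = value


def open_array(nrows, ncols, access_pattern):
    """Pick a layout from a declared access pattern (an assumed rule of thumb)."""
    if access_pattern == "subregion":
        return ChunkedStore(nrows, ncols)
    return RowMajorStore(nrows, ncols)


def column_sum(array, nrows, j):
    """Application code: written against logical coordinates only."""
    return sum(array.get(i, j) for i in range(nrows))
```

Calling open_array(4, 4, "scan") or open_array(4, 4, "subregion") yields
differently organized storage, yet column_sum() gives the same result on either
once the same values have been written.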