3.4.4.4 The Storage Classes
3.4.4.5 The Grid Storage Systems Deployment Working Group
3.4.5 Beyond WLCG: Data Management Use Cases in EGEE
3.5 Examples of Using File Streaming in Real Applications
3.5.1 Robust File Replication in the STAR Experiment
3.5.2 The Earth System Grid
3.6 Conclusions and Future Work
Acknowledgments
References
3.1 Introduction and Motivation
Dynamic storage space allocation has traditionally not been available to
scientific applications. Scientific researchers therefore usually assume that
storage space is preallocated and that the application will have enough space
for its input data and for the output data it generates. However, in modern
computer systems that support large scientific simulations and analysis, this
assumption is often false, and application programs often cannot complete
their computation for lack of storage space. Increases in computational power
have only exacerbated this problem. Although increased computational power
has created the opportunity for new, more precise and complex scientific
simulations that can lead to new scientific insights, such simulations and
experiments generate ever-increasing volumes of data. Allocating storage
dynamically (on demand), managing the content of the storage space, and
sharing data among users are crucial requirements for conducting modern
scientific explorations.
Typically, the scientific exploration process involves three stages of
data-intensive activities: data generation, postprocessing, and data analysis.
The data generation phase usually involves data collection from experimental
devices, such as charge-coupled devices measuring electric fields or satellite
sensors measuring atmospheric conditions. However, increasing volumes of data
are now generated by large-scale simulations, the so-called third pillar of
science, which now complement theory and experiment. During the data
generation phase, large volumes of storage have to be allocated for data
collection and archiving.
Data postprocessing involves digesting the large volumes of data from the
data generation phase and producing processed data whose volume can often be
as large as that of the input data. For example, raw data collected in high
energy physics (HEP) experiments are postprocessed to identify particle tracks
resulting from collisions in the accelerator. The postprocessed data in such
cases are of the same order of magnitude as the raw data. However, in many