calibration and alignment parameters. Therefore, it cannot be executed
entirely at CERN. Raw data is stored on tape at CERN and streamed to Tier-1
sites, where the reconstruction program should start shortly after the data
arrives. For this use case, the storage requirements are the following:
Specific data transfer servers with wide area network (WAN) access and
adequately large buffers need to be in place in order to efficiently receive
data coming from the Tier-0.
Discovery functions should allow for the identification of the data services
and buffers dedicated to the given experiments.
Data transfer services should allow for reliable and secure transfer of big
buffers of data. Such services should provide users with transfer scheduling
and retry functionalities.
Data transfer servers must be connected to the tape storage systems for
persistent storage of the data.
A proper storage interface to MSSs should be available in order to trigger and
control store and stage operations in an implementation-independent
way.
Given the amount of data involved, it is desirable to avoid making multiple
copies of the data. Therefore, the data needs to remain on disk for a
time sufficient to reconstruct it, before it is deleted to make space for
new data. The pinning functionality of SRMs allows for specifying a
lifetime associated with the data stored in a given space.
For a critical operation such as reconstruction of physics data, it is mandatory
not to compete for resources with other experiments. Therefore,
dedicated resources are normally required by the experiments.
Furthermore, it is important that user activities do not interfere with production
or import/export activities. Support is required for access control
lists on spaces provided by the storage services, as well as mechanisms
to block unwanted types of access to specific data buffers.
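
The pinning requirement above can be illustrated with a minimal sketch. It is not the SRM interface: the DiskBuffer class, its store method, and the BufferFull exception are hypothetical names introduced only for this example, and the sketch assumes that every file in the buffer already has a persistent copy on tape, so removing it from disk loses nothing. A file may be deleted to make space for new data only once its pin lifetime has expired.

```python
from dataclasses import dataclass


class BufferFull(Exception):
    """Raised when no space can be freed without breaking a pin."""


@dataclass
class PinnedFile:
    name: str
    size: int               # size in arbitrary units
    pin_expires_at: float   # time after which the file may be deleted


class DiskBuffer:
    """Hypothetical disk buffer that honors pin lifetimes (not an SRM API)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.files: dict[str, PinnedFile] = {}

    def used(self) -> int:
        return sum(f.size for f in self.files.values())

    def store(self, name: str, size: int, pin_lifetime: float, now: float) -> None:
        if name in self.files:
            raise ValueError(f"{name} is already in the buffer")
        free = self.capacity - self.used()
        # Only files whose pin has expired are candidates for deletion;
        # the earliest-expired files are considered first.
        expired = sorted(
            (f for f in self.files.values() if f.pin_expires_at <= now),
            key=lambda f: f.pin_expires_at,
        )
        if free + sum(f.size for f in expired) < size:
            raise BufferFull(f"cannot place {name}: remaining files are pinned")
        for victim in expired:
            if free >= size:
                break
            del self.files[victim.name]
            free += victim.size
        self.files[name] = PinnedFile(name, size, now + pin_lifetime)


# Usage: a second file is rejected while the first one is still pinned,
# and accepted once the first pin has expired.
buf = DiskBuffer(capacity=100)
buf.store("run1.raw", 60, pin_lifetime=3600, now=0)
try:
    buf.store("run2.raw", 60, pin_lifetime=3600, now=10)
    rejected = False
except BufferFull:
    rejected = True
buf.store("run2.raw", 60, pin_lifetime=3600, now=4000)
print(rejected, sorted(buf.files))
```

Passing the current time explicitly keeps the sketch deterministic; a real service would use its own clock and would also have to consider the access control and dedicated-resource requirements listed above.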
3.4.3.2 Mainstream Analysis
This use case can be considered the standard, scheduled activity of a physics
group at a university. The research group is interested in analyzing a
particular dataset (typically consisting of many gigabytes or several terabytes
of data) at a Tier-1 center that has free computing capacity. If the
data is not available at that site, it needs to be transferred in a scheduled
way and the operation might last for a few days. Once the data has arrived,
computing-intensive physics analysis operations can be done on the specified
data. For instance, the validation of reconstructed data is a process in which
the validity of the algorithms and parameters used is assessed. This process
requires access to 1%-2% of the total reconstructed data of an experiment.
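
To see why a scheduled transfer of a dataset of this size can take a day or more, a rough estimate helps. The sketch below is only back-of-the-envelope arithmetic: the function name, the 5 TB dataset size, the 1 Gbit/s link, and the 50% efficiency factor are all illustrative assumptions, not figures taken from any experiment.

```python
def transfer_time_seconds(size_bytes: float,
                          link_bits_per_s: float,
                          efficiency: float = 1.0) -> float:
    """Estimate transfer time as size / (link rate * achieved efficiency)."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if link_bits_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link rate must be positive and efficiency in (0, 1]")
    return size_bytes * 8 / (link_bits_per_s * efficiency)


TB = 10**12  # decimal terabyte, in bytes

# Assumed values: 5 TB dataset, 1 Gbit/s WAN link, half of it usable on average.
seconds = transfer_time_seconds(5 * TB, 1e9, efficiency=0.5)
days = seconds / 86400
print(f"{days:.2f} days")  # prints "0.93 days"
```

With a slower or more heavily shared link, the same dataset would take several days, which is consistent with the duration mentioned above.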