algorithms need to be rerun, and new reconstructed data needs to be
stored.
In this use case, there is a request for fast access to a substantial subset
of the data and for a large amount of computing power at peak times. This
may involve transferring raw data from tape to disk. Many tape drives can
therefore be kept busy by this task, which typically has high priority. Once the set of
calibration parameters proves to be accurate, it is stored in experiment-specific
databases that are distributed to a few sites for reasons of performance and
fault tolerance.
3.4.3.4 Chaotic Analysis
In contrast to the scheduled “mainstream analysis” of a particular physics
group, here a single physicist working on a specific analysis might request
access to a dataset that can be of any size; that is, it is not known a priori
how much data would need to be made available locally or accessed through
the WAN.
This use case is of particular importance for physicists, system
administrators, operators, and developers, since it can create worst-case
scenarios that stress the system. It can also help detect scalability issues
in many parts of the data access and storage system.
Because of this unpredictable behavior, it is very important to be able to
control storage resource usage and access accurately in order to prevent
problems. In particular, quotas and dynamic space reservation become essential.
Also important is the ability to control data and resource access through
local policies and access control lists. For instance, the ability to stage files
from tape to disk or to store results permanently on tape should be granted
only to users who hold certain roles and belong to specific groups. Data
processing managers within each experiment are allowed to check the resources
available and ensure correct usage. They need to check file ownership,
correct placement, sizes, and so forth. They can delete files or move them to
storage with an appropriate quality of service whenever needed.
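The two controls described above, quota-bounded space reservation and role-based
access policies, can be illustrated with a minimal Python sketch. It is not taken
from any WLCG or SRM implementation: the names SpaceManager, QuotaExceeded,
POLICY, and is_allowed, as well as the group and role labels, are hypothetical
and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


class QuotaExceeded(Exception):
    """Raised when a reservation would push a group past its quota."""


@dataclass
class SpaceManager:
    """Hypothetical tracker of dynamically reserved space per group (bytes)."""
    quotas: Dict[str, int]
    reserved: Dict[str, int] = field(default_factory=dict)

    def reserve(self, group: str, size: int) -> int:
        """Reserve `size` bytes for `group`; return the group's new total."""
        used = self.reserved.get(group, 0)
        # A group without a configured quota is treated as having quota 0.
        if used + size > self.quotas.get(group, 0):
            raise QuotaExceeded(f"{group}: request of {size} exceeds quota")
        self.reserved[group] = used + size
        return self.reserved[group]

    def release(self, group: str, size: int) -> int:
        """Give back up to `size` bytes; return the group's new total."""
        self.reserved[group] = max(0, self.reserved.get(group, 0) - size)
        return self.reserved[group]


# Hypothetical local policy: operation -> (group, role) pairs allowed to run it.
POLICY: Dict[str, Set[Tuple[str, str]]] = {
    "stage_from_tape": {("physics-group", "production")},
    "store_to_tape": {("physics-group", "production")},
    "read_disk": {("physics-group", "production"), ("physics-group", "analyst")},
}


def is_allowed(operation: str, group: str, role: str) -> bool:
    """Return True only if the policy explicitly grants the operation."""
    return (group, role) in POLICY.get(operation, set())
```

In this sketch an analyst may read from disk but cannot trigger tape staging,
and a reservation that would exceed the group quota is rejected before any
space is committed. Denying by default (an unknown operation or group yields
no access) is a design choice made here for safety, not something the text
prescribes.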
3.4.4 Storage Requirements
In this section we describe the current state and the continuous evolution of
storage services available on the WLCG and EGEE infrastructures.
3.4.4.1 The Classic Storage Element and SRM v1.1
A grid-enabled storage facility is called a storage element (SE). Originally,
an SE was nothing more than a GridFTP server in front of a set of disks,
possibly backed by a tape system. Such a facility is called a classic SE. It was
the first storage service implementation in the WLCG infrastructure. Many
tens of classic SEs are still in operation today, but they are used by VOs other than
the LHC experiments. Each supported VO has access to a part of the name