space on the SE, typically corresponding to a file system dedicated to the
VO. A shared file system with quotas would also work. A large MSS with a
tape back-end may be configured such that files in certain subsets of the name
space will be flushed to tape and recalled to disk as needed.
There are at least the following issues with the Classic SE:
It usually is not easy to keep enlarging the amount of disk space available
to a VO. Various modern file systems can grow dynamically when
new disks are made available through some logical volume manager,
but a single file system may become an I/O bottleneck, even when the
file system is built out of multiple machines in parallel (e.g., as a SAN
or a cluster file system). Furthermore, commercial advanced file systems
are expensive and may lead to vendor lock-in, while open-source
implementations have lacked maturity (this is steadily improving, though).
Instead, multiple file systems could be made available to a VO, mounted
on different parts of the VO name space, but it usually is impossible to
foresee in which parts more space will be needed. A site could grow its
storage by setting up multiple GridFTP servers, all with their own file
systems, but that may leave some of those servers idle while others are
highly loaded. Therefore, the desire is for an SE to present itself under
a single name, while making transparent use of multiple machines and
their independent, standard file systems. This is one of the main reasons
for developing the SRM concept.
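The single-name-over-many-file-systems idea can be sketched in a few lines. This is not a real SRM interface; the `Backend`, `choose_backend`, and `store` names, and the policy of picking the backend with the most free space, are invented purely to illustrate the concept:

```python
# Hypothetical sketch: present one logical namespace over several
# independent backend file systems, as an SRM aims to do.
# Backend, choose_backend, and store are illustrative, not a real SRM API.

from dataclasses import dataclass

@dataclass
class Backend:
    name: str        # e.g. a GridFTP server with its own file system
    free_bytes: int  # space currently available on that backend

def choose_backend(backends, size):
    """Pick the backend with the most free space that can hold `size` bytes."""
    candidates = [b for b in backends if b.free_bytes >= size]
    if not candidates:
        return None  # no single backend can take the file
    return max(candidates, key=lambda b: b.free_bytes)

def store(catalog, backends, lfn, size):
    """Map a logical file name (lfn) to a physical location on some backend,
    so clients keep seeing a single name space."""
    backend = choose_backend(backends, size)
    if backend is None:
        raise OSError("no backend with enough free space")
    backend.free_bytes -= size
    catalog[lfn] = (backend.name, size)
    return backend.name

backends = [Backend("gridftp1", 100), Backend("gridftp2", 500)]
catalog = {}
print(store(catalog, backends, "/vo/data/file1", 300))  # → gridftp2
```

The client only ever deals with the logical name; which backend file system holds the bytes is an internal decision the service can revisit over time.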
GridFTP servers lack advance space reservation: A client will have to try to
find out which fraction of its data it actually can upload to a particular
server, and look for another server to store the remainder.
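The burden this places on the client can be made concrete with a small sketch. The greedy splitting below, and the assumption that free space per server is even discoverable, are illustrative only:

```python
# Illustrative sketch of what a client must do *without* advance space
# reservation: upload what fits on each server and carry the remainder
# to the next one. Server names and free-space figures are hypothetical.

def plan_upload(total_bytes, server_free):
    """Greedily split an upload of total_bytes over servers with known
    free space; returns a list of (server, bytes) chunks."""
    plan, remaining = [], total_bytes
    for server, free in server_free.items():
        if remaining == 0:
            break
        chunk = min(free, remaining)
        if chunk > 0:
            plan.append((server, chunk))
            remaining -= chunk
    if remaining > 0:
        raise OSError(f"{remaining} bytes do not fit anywhere")
    return plan

print(plan_upload(700, {"se1.example.org": 500, "se2.example.org": 400}))
# → [('se1.example.org', 500), ('se2.example.org', 200)]
```

With advance space reservation the service would perform this accounting itself and hand the client a single token guaranteeing the full amount.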
A GridFTP server fronting a tape system has no elegant means to signal
that a file needs to be recalled from tape: The client will simply have
to remain connected and wait until the recall finishes. If it
disconnects, the server might take that as an indication that the client
is no longer interested in the file and that the recall should therefore be
canceled. Furthermore, there is no elegant way to keep a file pinned on
disk to prevent untimely cleanup by a garbage collector.
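The asynchronous stage-and-pin pattern an SRM offers instead can be sketched as follows. The `TapeStore` class and all of its methods are invented for illustration; they mimic the shape of such an interface, not any real implementation:

```python
# Hedged sketch of asynchronous recall plus pinning, in contrast to a
# GridFTP client that must hold its connection open during a tape recall.
# TapeStore and its methods are hypothetical, for illustration only.

import time

class TapeStore:
    def __init__(self, recall_seconds):
        self.recall_seconds = recall_seconds
        self.requests = {}   # file -> time when its recall completes
        self.pins = {}       # file -> pin expiry time

    def request_recall(self, path):
        """Start a recall; returns immediately, no open connection needed."""
        self.requests.setdefault(path, time.time() + self.recall_seconds)

    def is_online(self, path):
        """Poll whether the file has arrived on disk."""
        done = self.requests.get(path)
        return done is not None and time.time() >= done

    def pin(self, path, lifetime):
        """Keep the file on disk so the garbage collector skips it."""
        self.pins[path] = time.time() + lifetime

    def may_evict(self, path):
        """A garbage collector honours pins that have not yet expired."""
        return time.time() >= self.pins.get(path, 0)

store = TapeStore(recall_seconds=0.1)
store.request_recall("/vo/raw/run42.dat")
while not store.is_online("/vo/raw/run42.dat"):
    time.sleep(0.02)          # client polls instead of staying connected
store.pin("/vo/raw/run42.dat", lifetime=3600)
print(store.may_evict("/vo/raw/run42.dat"))  # → False
```

The key differences from the GridFTP situation are that the recall request survives client disconnection, and that the pin gives the file an explicit, bounded lifetime on disk.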
A GridFTP server has no intrinsic means to replicate hot files for better
availability. The host name of a GridFTP service could be a round-robin
or load-balanced alias for a set of machines, but then each of them must
have access to all the files. This could be implemented by some choice
of shared file system, or by having the GridFTP server interact with a
management service that will replicate hot files on the fly, making space
by removing replicas of unpopular files as needed. Such functionality is
naturally implemented by an SRM.
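A minimal sketch of such a replication policy follows. The threshold-based notion of a "hot" file and the evict-the-least-popular rule are assumptions chosen for illustration, not the policy of any particular SRM:

```python
# Sketch (not a real SRM implementation) of the policy described above:
# replicate files whose access count crosses a threshold, and make room
# by dropping the replica of the least popular currently-replicated file.

from collections import Counter

class ReplicaManager:
    def __init__(self, pool_size, hot_threshold):
        self.pool_size = pool_size        # extra replicas the pool can hold
        self.hot_threshold = hot_threshold
        self.hits = Counter()             # access count per file
        self.extra_replicas = set()       # files replicated beyond one copy

    def access(self, path):
        self.hits[path] += 1
        if (self.hits[path] >= self.hot_threshold
                and path not in self.extra_replicas):
            self._make_room()
            self.extra_replicas.add(path)  # replicate the hot file

    def _make_room(self):
        if len(self.extra_replicas) < self.pool_size:
            return
        # evict the replica of the least popular replicated file
        victim = min(self.extra_replicas, key=lambda p: self.hits[p])
        self.extra_replicas.remove(victim)

mgr = ReplicaManager(pool_size=1, hot_threshold=2)
for _ in range(2):
    mgr.access("/vo/a")      # /vo/a becomes hot, gains an extra replica
for _ in range(3):
    mgr.access("/vo/b")      # /vo/b turns hot too; /vo/a's replica is evicted
print(sorted(mgr.extra_replicas))  # → ['/vo/b']
```

In a real service the replicas would live on distinct machines behind the load-balanced alias, so popular files can be served from several hosts at once.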
In the spring of 2004 the WLCG middleware releases started including SRM
v1.1 client support in data management. The first SRM v1.1 service available
on the WLCG infrastructure (to the CMS experiment) was a dCache instance