Management (SRM) systems [SRM], cloud-related Simple Storage Services (S3)
[S3], as well as logical file systems such as LCG File Catalogs [LFC], and integrated
Rule-Oriented Data-management Systems (iRODS) [IRODS] can potentially
be used from within distributed computing infrastructures such as grids or clusters.
Each storage solution has been developed to serve specific goals by fulfilling
specific criteria. However, accessing these storage resources typically requires
dedicated protocols and tools to allow users to organize, manage, upload or
download data via a graphical or command line interface. Selecting a suitable
storage resource for our requirements thus involves setting up the appropriate
software environment, and learning how to use the related tools.
Various user communities wish to exploit the computational power of distributed
infrastructures. It generally cannot be expected that these communities, including
agronomists, biologists, and astronomers, have deep IT expertise, as technical
details of the underlying infrastructure are out of the domain of their main interest.
Providing user-friendly graphical user interfaces is thus of high practical importance.
Most storage resources and tools provided today, however, only partly fulfill
this criterion, which makes wider use of storage resources difficult or impossible.
As technology evolves or requirements change, it may become necessary to
move our existing data from one storage to another. Although there exist tools that
can connect our local machines to a particular storage resource (e.g., GridFTP GUI
[GridFTP GUI], DragonDisk [DragonDisk], Cyberduck [Cyberduck]) as well as
tools capable of transferring data between storage resources of the same type
(Globus Online [GlobusOnline], Transmit [Transmit]), it is generally an unsolved
issue to migrate data between storage resources of different types. Downloading
data to our local machine, then uploading to the new storage is often not feasible
(due to disk capacity or file size limits), so there is a practical need for a tool
that enables transferring data between different storage resources.
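One way to avoid staging a whole file on local disk is to relay it in fixed-size chunks from the source to the target. The following is a minimal sketch of that idea only, not Data Avenue's implementation: the function name `stream_copy`, the chunk size, and the in-memory streams standing in for two storage resources are all assumptions made for illustration.

```python
import io
from typing import BinaryIO

# Hypothetical chunk size: at most this many bytes are held in memory at once.
CHUNK_SIZE = 64 * 1024


def stream_copy(source: BinaryIO, target: BinaryIO,
                chunk_size: int = CHUNK_SIZE) -> int:
    """Relay data from source to target chunk by chunk.

    Returns the total number of bytes copied. The full content is never
    stored locally, so local disk capacity and file size limits do not apply.
    """
    copied = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read signals end of stream
            break
        target.write(chunk)
        copied += len(chunk)
    return copied


# In-memory streams stand in for a download stream from one storage
# resource and an upload stream to another (illustrative only).
source = io.BytesIO(b"x" * 200_000)
target = io.BytesIO()
total = stream_copy(source, target)
```

Python's standard library offers the same loop as `shutil.copyfileobj`; it is written out here to make the bounded memory use explicit.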
In a distributed computation infrastructure, computing elements require the
appropriate application programming interfaces (APIs) to access remote storage
resources, to fetch input data, and store computational outputs, respectively. Such
an API for the selected storage resource may not be available on the computing
elements in the given infrastructure by default. Embedding code that serves only to
access storage resources in the application logic is best avoided, and requiring
storage-access libraries to be preinstalled limits the portability of the
application. Therefore, a solution that provides uniform access to different
storage resources, and that is available in any distributed infrastructure, is
of high importance.
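The idea of uniform access can be sketched as a single interface with one adapter per storage type, selected by the scheme of the resource URI. This is a hypothetical illustration and not the Data Avenue API: `StorageAdapter`, `InMemoryAdapter`, `adapter_for`, and `copy` are invented names, and the in-memory adapters merely stand in for real protocol clients.

```python
from abc import ABC, abstractmethod
from typing import Dict, List
from urllib.parse import urlparse


class StorageAdapter(ABC):
    """Hypothetical uniform interface that application code programs against."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def list_entries(self, prefix: str) -> List[str]: ...


class InMemoryAdapter(StorageAdapter):
    """Stand-in for a real protocol client (assumption for illustration)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def read(self, path: str) -> bytes:
        return self._objects[path]

    def write(self, path: str, data: bytes) -> None:
        self._objects[path] = data

    def list_entries(self, prefix: str) -> List[str]:
        return sorted(p for p in self._objects if p.startswith(prefix))


# One adapter per URI scheme; a real deployment would register protocol clients.
ADAPTERS: Dict[str, StorageAdapter] = {
    "s3": InMemoryAdapter(),
    "gsiftp": InMemoryAdapter(),
}


def adapter_for(uri: str) -> StorageAdapter:
    """Pick the adapter from the URI scheme."""
    scheme = urlparse(uri).scheme
    if scheme not in ADAPTERS:
        raise ValueError(f"unsupported storage scheme: {scheme!r}")
    return ADAPTERS[scheme]


def copy(src_uri: str, dst_uri: str) -> None:
    """Copy one object between storage types through the uniform interface."""
    data = adapter_for(src_uri).read(urlparse(src_uri).path)
    adapter_for(dst_uri).write(urlparse(dst_uri).path, data)


# Usage: the calling code never touches a protocol-specific library.
adapter_for("gsiftp://host/data/in.dat").write("/data/in.dat", b"payload")
copy("gsiftp://host/data/in.dat", "s3://bucket/results/out.dat")
```

With this arrangement, supporting another storage type means registering one more adapter, while the application logic stays unchanged.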
This chapter proposes a solution called Data Avenue to address the problems
above. Data Avenue provides a web-based, intuitive graphical user interface, which
requires no software installation or particular learning to use. It completely hides the
technical details of connecting to a particular storage resource, and provides a
uniform rendering of the set of data (files and folders), which allows users to easily
manage, organize, upload and download data. Data Avenue is capable of performing
data migration from one storage to another without additional effort; and
finally, using HTTP tunneling provided by Data Avenue, computing elements