Standardized Multi-protocol Data Management
for Grid and Cloud GridRPC Frameworks
Yves Caniou 1, Hadrien Croubois 2, and Gael Le Mahec 3
1 Université de Lyon, JFLI CNRS, Japan
Yves.Caniou@ens-lyon.fr
2 Université de Lyon, ENS Lyon
Hadrien.Croubois@ens-lyon.fr
3 Université de Picardie Jules Verne, MIS Laboratory, France
Gael.Le.Mahec@u-picardie.fr
Abstract. GridRPC is an international standard of the Open Grid Forum
defining an API designed to allow applications to be submitted in a
seamless way to large-scale, heterogeneous, and geographically
distributed computing platforms. The first versions of the standard did
not include any data management features. Data were parameters of the
remote procedure calls, with no possibility of prefetching them or of
using persistence, replication, external sources, etc., which made
GridRPC codes middleware dependent. The data extension of the standard
introduced a small set of functions and data structures that complete
the API with simple but powerful data management features. In this
paper, we present a modular and extensible implementation of both APIs,
which needs only a few developments to be usable with any middleware
relying on RPC, and which provides access to data through numerous,
easily extended protocols and data middleware. By adding data management
functions, it opens up optimization opportunities for large-scale
applications.
1 Introduction
Many applications use RPC-like mechanisms to distribute computations over
the nodes of clusters and supercomputers that make up distributed systems
such as a grid, a cloud, or both (now referred to as sky computing).
Combined with connections to huge databases, they more or less
transparently let scientists focus on their core topic, giving them more
time for data analysis, without having to deal with the underlying
complexity of all the different mechanisms involved in job and data
management. More recently, applications even directly couple analysis,
graphical representation, and the like, making platform management only
one part of the project, whose actions are generally available through a
web site. Surprisingly, when a new area or a new platform is considered,
new independent pieces of software are often developed instead of
reusing previous work, software, or standardized APIs.
The Open Grid Forum standard defining the GridRPC paradigm, namely
Remote Procedure Call over the Grid, was published in 2007, benefiting
 