(For the moment this information is static in the configuration file; we intend
to investigate whether it makes sense to have it self-tuned by the library,
depending on the dynamics of both the network and the computing performance.)
It also shows that every computation starts as soon as enough matrices have
finished downloading. As a side effect, it also confirms the observations made
in [8]: there is a real need to limit the number of possible parallel transfers.
Indeed, on this small example we can observe that the overall completion time of
the addition of the 16 matrices is slightly reduced when the limit is set to 2
on our small testbed.
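As an illustration only, the following sketch shows one way such a limit on
parallel transfers can be enforced with a counting semaphore initialized from a
configuration value; the constant MAX_PARALLEL_TRANSFERS, the worker thread and
the do_transfer() placeholder are assumptions made for the example and are not
taken from the library.

  /* Illustrative sketch (not the library's code): cap the number of
   * concurrent transfers with a counting semaphore, as suggested by the
   * experiment where a limit of 2 reduced the completion time. */
  #include <pthread.h>
  #include <semaphore.h>
  #include <stdio.h>

  #define MAX_PARALLEL_TRANSFERS 2   /* would come from the configuration file */
  #define NB_MATRICES 16

  static sem_t transfer_slots;       /* counts the free transfer slots */

  /* Placeholder for an actual data transfer (e.g., one input matrix). */
  static void do_transfer(int id)
  {
      printf("transferring matrix %d\n", id);
  }

  static void *transfer_worker(void *arg)
  {
      int id = *(int *)arg;
      sem_wait(&transfer_slots);     /* block while the limit is reached */
      do_transfer(id);
      sem_post(&transfer_slots);     /* free the slot for the next transfer */
      return NULL;
  }

  int main(void)
  {
      pthread_t threads[NB_MATRICES];
      int ids[NB_MATRICES];

      sem_init(&transfer_slots, 0, MAX_PARALLEL_TRANSFERS);
      for (int i = 0; i < NB_MATRICES; i++) {
          ids[i] = i;
          pthread_create(&threads[i], NULL, transfer_worker, &ids[i]);
      }
      for (int i = 0; i < NB_MATRICES; i++)
          pthread_join(threads[i], NULL);
      sem_destroy(&transfer_slots);
      return 0;
  }

With the semaphore initialized to 2, at most two of the 16 transfers run at
once; the others block in sem_wait() until a slot is released.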
5 Future Work
Future work heads in several directions. Although the library is already usable
and implements most of the API, more performance can be obtained through more
efficient scheduling, both in the request controller (Section 3.2) and at the
dispatcher level (Section 3.3), and through further development. For example, a
middleware module for ssh would add more scheduling possibilities. The memory
protocol already leads to complex data management mechanisms, which remain to
be extended, together with a file protocol that would help avoid useless data
copies and make the library even more scalable. Modules for dCache and GridFTP
could make transfers faster, but bandwidth consumption would then need tighter
control. A data manager module for Amazon S3 3 would give access to cloud
storage resources, which in turn requires taking financial criteria into
account in the above scheduling process and migrating data when possible
(e.g., when the data is requested as GRPC_PERSISTENT).
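To make the idea of such back-end modules more concrete, here is a purely
hypothetical sketch of what a pluggable data manager module for a back end such
as GridFTP or Amazon S3 could look like; the structure, the function names and
the cost field are assumptions made for illustration and do not reflect the
library's actual interface.

  /* Hypothetical sketch of a pluggable data manager module; names and
   * fields are illustrative only, not the library's real interface. */
  #include <stddef.h>
  #include <stdio.h>
  #include <string.h>

  typedef struct {
      const char *name;                                  /* e.g. "s3", "gridftp" */
      int (*put)(const char *uri, const void *buf, size_t len);
      int (*get)(const char *uri, void *buf, size_t len);
      double cost_per_gb;        /* illustrative financial criterion for scheduling */
  } data_manager_module;

  #define MAX_MODULES 8
  static const data_manager_module *modules[MAX_MODULES];
  static int nb_modules;

  static int register_data_manager(const data_manager_module *mod)
  {
      if (nb_modules == MAX_MODULES)
          return -1;
      modules[nb_modules++] = mod;
      return 0;
  }

  /* Dummy transfer functions standing in for real back-end calls. */
  static int s3_put(const char *uri, const void *buf, size_t len)
  { (void)buf; printf("put %zu bytes to %s\n", len, uri); return 0; }
  static int s3_get(const char *uri, void *buf, size_t len)
  { (void)buf; printf("get %zu bytes from %s\n", len, uri); return 0; }

  int main(void)
  {
      static const data_manager_module s3 = { "s3", s3_put, s3_get, 0.1 };
      register_data_manager(&s3);
      modules[0]->put("s3://bucket/matrix-A", "data", strlen("data"));
      return 0;
  }

Registering modules this way would let the dispatcher choose a back end per
transfer, and a per-gigabyte cost field is one simple way the financial
criteria mentioned above could enter the scheduling decision.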
6 Conclusion
The GridRPC Data Management standard completes previous work on GridRPC at both
the API and software levels, putting feasibility of computations and performance
within reach, with immediate portability and interoperability between GridRPC
middleware. To ease its adoption, while giving access to GridRPC middleware and
to existing data managers, we provide an implementation of both APIs relying on
a very modular architecture. Fulfilling the standard's requirements, the library
also implements the data management modes as well as a memory protocol that
avoids useless copies to disk. We showed that an efficient system to handle
waiting mechanisms is in place and that we perform some mapping/scheduling when
several transfers are involved in the same data management operation. We
conducted experiments and obtained results validating the expected behaviors.
We will now focus on more theoretical work to improve the non-trivial
mapping/scheduling of the transfers involved for a given piece of data, and we
are considering plugging in a workflow/dataflow analysis tool to schedule
transfers of different data together with the remote procedure calls.
3 http://aws.amazon.com/
 