Resource Management Application API
specification, an OGF standard.
application can be successfully ported to the grid.
Therefore, in the following, we present a brief
overview of the main data management related
aspects, tasks and issues which might affect the
process of grid-enabling an application, such as
data types and size, shared data access, temporary
data spaces, network bandwidth, time-sensitive
data, location of data, data volume and scalability,
encrypted data, shared file systems, databases,
replication, and caching. For a more in-depth dis-
cussion of data management related tasks, issues,
and techniques, we refer to Bart Jacob's tutorial on
application enabling with Globus (Jacob, 2003).
Stream package - provides methods for
authenticated local and remote socket con-
nections with hooks to support authoriza-
tion and encryption schemes.
RPC package - is an implementation of the
OGF GridRPC API definition and provides
methods for unified remote procedure calls.
The two critical aspects of SAGA are its simplicity of use and the fact
that it is well on the road to becoming a community standard. It is
important to note that these two properties provide the added value of
using SAGA for Grid application development. Simplicity arises from
limiting the scope to only the most common and important grid
functionality required by applications. There are major advantages
arising from its simplicity and imminent standardization.
Standardization reflects the fact that the interface is derived from a
wide range of applications using a collaborative approach, the output of
which is endorsed by the broader community.
More information about the SAGA C++ Reference Implementation (developed
at the Center for Computation and Technology at Louisiana State
University) and various aspects of Grid enabling toolkits is available
on the SAGA implementation home page (SAGA, 2006).
Shared Data Access
Data access may be shared with concurrent jobs and other processes
within the network.
Access to data input and the data output of
the jobs can be of various kinds. During the plan-
ning and design of the Grid application, potential
restrictions on the access of databases, files, or
other data stores for either read or write have to
be considered. The installed policies need to be
observed and sufficient access rights have to be
granted to the jobs. Concerning the availability of
data in shared resources, it must be assured that at
run-time of the individual jobs the required data
sources are available in the appropriate form and
at the expected service level. Potential data access
conflicts need to be identified up front and planned
for. Individual jobs should not try to update the same record at the
same time, nor deadlock each other. Situations of concurrent access
need care, and resolution policies have to be imposed.
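
The two hazards named above, conflicting updates to the same record and jobs deadlocking each other, can be illustrated with a minimal sketch. It assumes in-process Python threads as stand-ins for grid jobs and an in-memory dictionary as a stand-in for the shared data store; `RecordStore`, `transfer`, and `run_jobs` are hypothetical names introduced only for this illustration. A real grid deployment would more likely rely on database transactions or a lock service provided by the middleware.

```python
import threading


class RecordStore:
    """Toy in-memory stand-in for a shared data store (hypothetical)."""

    def __init__(self, records):
        self._records = dict(records)
        # One lock per record, so updates to unrelated records do not block.
        self._locks = {key: threading.Lock() for key in self._records}

    def transfer(self, src, dst, amount):
        """Move `amount` from record `src` to record `dst` as one unit."""
        if src == dst:
            return  # nothing to do; also avoids locking the same record twice
        # Always acquire the two locks in the same (sorted) order. Two jobs
        # touching the same pair of records can then never end up each
        # holding one lock while waiting for the other.
        first, second = sorted((src, dst))
        with self._locks[first], self._locks[second]:
            self._records[src] -= amount
            self._records[dst] += amount

    def snapshot(self):
        """Return a copy of all records (call after the jobs have finished)."""
        return dict(self._records)


def run_jobs(store, n_jobs=8, n_updates=1000):
    """Run concurrent jobs that update records 'a' and 'b' in opposite directions."""

    def job(i):
        # Even-numbered jobs move a -> b, odd-numbered jobs move b -> a.
        src, dst = ("a", "b") if i % 2 == 0 else ("b", "a")
        for _ in range(n_updates):
            store.transfer(src, dst, 1)

    threads = [threading.Thread(target=job, args=(i,)) for i in range(n_jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.snapshot()


if __name__ == "__main__":
    store = RecordStore({"a": 100, "b": 100})
    print(run_jobs(store))
```

The fixed lock order is one possible resolution policy; timeouts with retry, or optimistic updates checked against a version number, are alternatives when jobs cannot agree on a global ordering.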
The use of federated databases may be useful in data Grids where jobs
must handle large amounts of data in various different data stores.
They offer a single interface to the application and are capable of
accessing data in large
heterogeneous environments. Federated database
systems contain information about location (node,
database, table, record) and access methods (SQL,
VSAM, privately defined methods) of connected
GRID APPLICATIONS AND DATA
Any e-science application at its core has to deal
with data, from input data (e.g. in the form of output
data from sensors, or as initial or boundary data),
to processing data and storing of intermediate
results, to producing final results (e.g. data used
for visualization). Data has a strong influence
on many aspects of the design and deployment
of an application and determines whether a Grid