unique challenges for executing large-scale sci-
entific applications.
This research is motivated by the need to sup-
port fault-tolerant communication within scientific
workflows. A workflow consists of multiple pro-
cessing stages, where intermediate data generated
in one stage are processed in subsequent stages.
A workflow component can be a device or an
application, which is often modified to enable
communication. Thus, a scientific workflow is a
computational/data-processing pipeline, with data
being captured, processed and manipulated as they
pass through various stages (Figure 1). Currently,
the data transfers between component applications
are realised by: (a) file transfers (e.g. GridFTP);
(b) remote procedure calls (e.g. RPC-V, GridRPC,
OmniRPC); and (c) custom mechanisms (e.g.
Web Services).
For coupling workflow components, we pro-
pose the π-channel, an asynchronous and persis-
tent pipe mechanism. It is part of the π-Spaces/π-
channels programming model which features:
1. Simplified application coupling using string
channel names through π-Spaces. A π-Space
is a name space for π-channels, enabling
dynamic binding of channel endpoints between
processes.
2. π-channel data are adaptively cached to
achieve persistence. This allows π-channels
to be created and written to, even in the
absence of the reader. Persistence also makes
π-channels accessible even after the writer
has terminated.
3. Asynchronous receives are made possible
through a communication thread; thus, an
application is able to accept pipe segments
even when it is busy with computation.
This article focuses on how π-channel
persistence relates to fault-tolerant communication
in scientific workflows. The extended API and
semantics for π-Spaces/π-channels are presented.
We describe the design and implementation of
π-channels, including the server that implements
this model along with the underlying distributed
algorithm.
This article is organised as follows: We review
related work in § 2. Then, we
present the π-Spaces/π-channels programming
model in § 3, including its application
programming interface, semantics, and how fault-tolerance
is achieved for workflows. In § 4, we discuss in
detail its design and implementation, describing
the distributed algorithm. Experimental results
are presented in § 5, followed by the conclusions.
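The three features of the programming model (binding by string name, persistence, and asynchronous receives) can be illustrated with a small single-process sketch. This is not the π-Spaces/π-channels API, which is presented later in the article; the names PiSpace, PiChannel, open, write, close, read and receive_in_background are hypothetical, and an in-memory queue stands in for the adaptive caching performed by the real system. The sketch only shows the behaviour described above: a writer can fill a named channel and terminate before any reader exists, and a reader can later drain the channel on a separate thread while it continues computing.

```python
import threading
from collections import deque


class PiChannel:
    """Hypothetical in-memory stand-in for a persistent pipe."""

    def __init__(self):
        self._segments = deque()   # cached segments, kept until they are read
        self._closed = False       # set once the writer has finished
        self._cond = threading.Condition()

    def write(self, segment):
        # Succeeds even if no reader has attached yet: segments are cached.
        with self._cond:
            if self._closed:
                raise ValueError("channel is closed for writing")
            self._segments.append(segment)
            self._cond.notify_all()

    def close(self):
        # Marks end of stream; cached segments stay readable afterwards.
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def read(self):
        # Blocks until a segment is available; returns None at end of stream.
        with self._cond:
            while not self._segments and not self._closed:
                self._cond.wait()
            if self._segments:
                return self._segments.popleft()
            return None


class PiSpace:
    """Hypothetical name space mapping string names to channels."""

    def __init__(self):
        self._channels = {}
        self._lock = threading.Lock()

    def open(self, name):
        # Writer and reader bind to the same channel by name, in either order.
        with self._lock:
            if name not in self._channels:
                self._channels[name] = PiChannel()
            return self._channels[name]


def receive_in_background(channel, inbox):
    """Drain a channel into inbox on a separate communication thread."""
    def drain():
        while True:
            segment = channel.read()
            if segment is None:
                break
            inbox.append(segment)

    thread = threading.Thread(target=drain)
    thread.start()
    return thread


def demo():
    space = PiSpace()

    # Writer stage: runs to completion before any reader exists.
    out = space.open("stage1-to-stage2")
    for segment in (b"alpha", b"beta", b"gamma"):
        out.write(segment)
    out.close()
    del out  # the writer is gone; the channel persists in the space

    # Reader stage: binds by name later and receives while computing.
    inbox = []
    receiver = receive_in_background(space.open("stage1-to-stage2"), inbox)
    busy_result = sum(range(1000))  # stand-in for the stage's computation
    receiver.join()
    return inbox, busy_result
```

Calling demo() returns the segments in the order they were written, together with the result of the stand-in computation. In a real deployment the channel state would live in a server process rather than in the application's own memory, which is what allows it to outlive a failed or terminated writer.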
RELATED WORK
We briefly review the major models for
communication in distributed environments, highlighting
their differences from π-Spaces/π-channels.
Figure 1. A simple four-stage workflow applica-
tion. Arrows indicate data flow between compo-
nent applications. Application B is an n-process
parallel application.
Pipe/Channel Models
The pipe/channel is a well-known IPC mechanism
and appears in many forms: Unix pipes, named
pipes, and TCP sockets (Stevens, 1998). Sockets
with TCP, while used in network programming,
are too low-level for scientific application pro-
gramming. In particular, since communication
endpoints are identified using IP/host addresses
and port numbers, they are tedious to use in a dynamic,
failure-prone environment. In the event of a link