top of the low-level grid middleware (e.g. Globus Toolkit [45] and UNICORE [51]),
through which the workflow management system invokes services provided by grid
resources. Some representative grid workflow systems include ASKALON [43],
GrADS [45], GridAnt [45], Gridbus [46], GridFlow [45], Kepler [47], Pegasus [48],
Taverna [49] and Triana [50]. In [59], several representative grid workflow
systems are compared in terms of (1) scheduling architecture, (2) decision
making, (3) planning scheme, (4) scheduling strategy, and (5) performance
estimation. The work in [24] also surveys the support for temporal QoS
(Quality of Service) in popular scientific workflow systems. In a scientific workflow
system, at the build-time stage, scientific processes are modeled or redesigned as
workflow specifications which normally contain the process structure, the functional
requirements for workflow activities and their non-functional requirements such as
QoS constraints on time, cost, reliability, security and so on. At the runtime stage,
workflow instances are executed by employing the computing and data sharing ability
of the underlying computing infrastructures and software services [27].
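
The build-time and runtime split described above can be illustrated with a minimal Python sketch. It is not taken from any of the cited systems: the class names (`QoSConstraints`, `Activity`, `WorkflowSpec`), the field names, and the example activities and durations are all hypothetical. The sketch assumes a specification holds a process structure (dependencies between activities), a functional requirement per activity (the service to invoke), and non-functional QoS constraints, and that a runtime engine derives a valid execution order from that structure.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class QoSConstraints:
    """Non-functional requirements of the workflow (hypothetical fields)."""
    deadline_s: float  # temporal constraint: maximum total duration in seconds
    max_cost: float    # cost budget in arbitrary units (not checked in this sketch)


@dataclass
class Activity:
    name: str
    service: str           # functional requirement: the service to invoke
    est_duration_s: float  # build-time duration estimate in seconds
    depends_on: list[str] = field(default_factory=list)


@dataclass
class WorkflowSpec:
    """Build-time specification: process structure plus QoS constraints."""
    activities: dict[str, Activity]
    qos: QoSConstraints

    def execution_order(self) -> list[str]:
        # Map each activity to its predecessors and sort topologically,
        # giving one order a runtime engine could follow.
        graph = {a.name: set(a.depends_on) for a in self.activities.values()}
        return list(TopologicalSorter(graph).static_order())

    def estimated_makespan(self) -> float:
        # Longest dependency path, assuming independent activities
        # can run in parallel without resource limits.
        finish: dict[str, float] = {}
        for name in self.execution_order():
            act = self.activities[name]
            start = max((finish[d] for d in act.depends_on), default=0.0)
            finish[name] = start + act.est_duration_s
        return max(finish.values(), default=0.0)

    def meets_deadline(self) -> bool:
        return self.estimated_makespan() <= self.qos.deadline_s


def example_spec() -> WorkflowSpec:
    # Hypothetical four-activity workflow with a 90-second deadline.
    acts = [
        Activity("fetch", "data-transfer", 10.0),
        Activity("clean", "preprocess", 20.0, ["fetch"]),
        Activity("index", "indexer", 5.0, ["fetch"]),
        Activity("analyze", "analysis", 30.0, ["clean", "index"]),
    ]
    return WorkflowSpec({a.name: a for a in acts},
                        QoSConstraints(deadline_s=90.0, max_cost=100.0))
```

Calling `example_spec().meets_deadline()` checks the estimated longest path (fetch, clean, analyze) against the deadline. A real system would replace the fixed estimates with measured or predicted service durations.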
With the emergence of the cloud computing paradigm, the trend for distributed
workflow systems is shifting to cloud computing based workflow systems, or
cloud workflow systems for short. Given the advantages of cloud computing, cloud
workflow systems are being widely used as platform software (or middleware
services) to facilitate the usage of cloud services. In [44], the CloudBus workflow
management system deploys and manages job execution using Aneka, which acts as
cloud middleware. In [41], using Montage on top of the Pegasus-WMS software, the authors
investigate the differences between running scientific workflows on the cloud and on
the grid. SwinDeW-C is a peer-to-peer based prototype cloud workflow system which
runs on the SwinCloud cloud computing test bed, mainly for scientific workflow
applications such as pulsar searching [34]. Meanwhile, since Hadoop is almost a de
facto standard for processing large datasets in the cloud, commercial public clouds
such as Amazon Web Services and Microsoft Windows Azure have provided Hadoop
clusters to enable scientific computing on the public cloud [1, 37].
Unlike traditional workflow systems, which mainly invoke and execute software
components from their own local software repository, scientific cloud workflow
systems utilize software services in the cloud, which are accessed through the Internet and
executed at the service provider's infrastructure [4, 18]. A cloud workflow instance
consists of many partially ordered cloud services, possibly from a number of
different service providers. Therefore, the quality of a cloud workflow application is
determined by the collective behaviors of all the cloud software services employed by
the workflow application. Given the uncertainty that lies in every cloud service,
the quality of a cloud workflow instance becomes a much more complex combinatorial
problem. Specifically, how to guarantee the delivery of satisfactory temporal QoS, namely
how to achieve a high on-time completion rate for scientific cloud workflows, is a big
challenge [26]. This is because, in the real world, a scientific workflow normally runs in a
temporal context and is often time constrained to achieve on-time completion of
certain goals. For example, a weather forecast scientific workflow must be finished
before the weather forecast program is shown every day, for instance, at 6:00pm. A pulsar
searching scientific workflow needs to be completed within 24 hours so as to meet the