Information Technology Reference
observation schedule of the telescope. Failure to complete on time will reduce
the value of the workflow output.
In recent years, workflow temporal verification has become the major approach for
the assurance of temporal QoS and an important research topic in the workflow area.
As an important dimension of workflow QoS constraints, temporal constraints
such as global deadlines and local milestones are often set at build time and verified at
run time to ensure the targeted on-time completion rate of scientific workflows.
Workflow temporal verification, as one of the fundamental workflow system
functionalities, is often implemented to monitor workflow runtime execution and maintain
the targeted temporal QoS. However, given a large-scale data and computation
intensive scientific workflow application and its dynamic cloud computing infrastructure,
systematic investigation is required. Therefore, great efforts have been dedicated to
the area of workflow temporal verification in recent years, and it is high time to
define the key research issues for scientific cloud workflows in order to keep
our research on the right track.
The remainder of the paper is organized as follows. Section 2 presents a motivating
example and then introduces a generic temporal verification framework with the four
basic research issues and their state-of-the-art solutions. Section 3 further discusses
the open challenges and presents some potential research directions. Section 4
introduces SwinDeW-V, an ongoing research project on temporal verification in our
SwinDeW-C (Swinburne Decentralized Workflow for Cloud) cloud workflow system.
Finally, Section 5 concludes the paper.
Basic Research Issues
In this section, we first introduce a pulsar searching scientific workflow to illustrate
the problem of temporal verification for scientific cloud workflows and then
present a generic temporal verification framework with the four basic research issues.
Motivating Example and Problem Analysis
The pulsar searching process is a typical scientific workflow which involves a large
number of data and computation intensive activities. For a typical single searching
process, the average data volume is over 4 terabytes and the average execution time is
about 23 hours on the Swinburne high-performance supercomputing facility. The pulsar
searching process contains hundreds of high-level workflow activities,
and each may contain dozens or even more computation and data intensive tasks.
For example, the data extraction and transfer sub-process may take about 1.5 hours,
and the de-dispersion activity, which counteracts the effect of the interstellar medium
on the pulsar signals, normally requires 13 hours. According to the research schedule, a
single searching process is required to be completed within one day, i.e., 24 hours.
However, since the average execution time of the whole process is about 23 hours and
the durations of most activities are very dynamic, it is very difficult to ensure on-time
completion of such a process without effective monitoring and control mechanisms.
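
The figures above leave very little slack, which a short calculation makes concrete.
The following Python sketch is illustrative only: the 24-hour deadline, the 23-hour
average execution time and the 1.5-hour and 13-hour activity durations come from the
text, whereas the function names, the assumption that the two activities run one after
the other, and the hypothetical 10% overrun of de-dispersion are ours.

```python
# Figures taken from the text above.
DEADLINE_H = 24.0        # global deadline for one searching process
MEAN_TOTAL_H = 23.0      # average execution time of the whole process
MEAN_EXTRACT_H = 1.5     # data extraction and transfer sub-process
MEAN_DEDISPERSE_H = 13.0 # de-dispersion activity


def slack_hours(deadline_h: float, expected_total_h: float) -> float:
    """Time redundancy available before the global deadline is violated."""
    return deadline_h - expected_total_h


def violates_deadline(elapsed_h: float, expected_remaining_h: float,
                      deadline_h: float = DEADLINE_H) -> bool:
    """Checkpoint test: is the projected completion time past the deadline?"""
    return elapsed_h + expected_remaining_h > deadline_h


slack = slack_hours(DEADLINE_H, MEAN_TOTAL_H)
slack_ratio = slack / MEAN_TOTAL_H  # slack relative to the mean duration

# Assumption: the two activities run sequentially, and the remaining work
# still takes its mean duration.
expected_remaining = MEAN_TOTAL_H - MEAN_EXTRACT_H - MEAN_DEDISPERSE_H

# Case 1: both activities finish in their mean time.
on_schedule = violates_deadline(MEAN_EXTRACT_H + MEAN_DEDISPERSE_H,
                                expected_remaining)

# Case 2 (hypothetical): de-dispersion runs 10% longer than its mean.
delayed = violates_deadline(MEAN_EXTRACT_H + MEAN_DEDISPERSE_H * 1.10,
                            expected_remaining)

print(f"slack: {slack:.1f} h ({slack_ratio:.1%} of the mean duration)")
print(f"violation if on schedule: {on_schedule}")
print(f"violation after a 10% de-dispersion overrun: {delayed}")
```

Under these assumptions the slack is only one hour, roughly 4% of the mean duration,
so a 10% overrun in the de-dispersion activity alone (1.3 hours) is already enough to
push the projected completion time past the deadline.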