be completely automated and seamlessly integrated into the overall anal-
ysis process, which leads to much faster and significantly less error-prone
execution of standard data processing tasks. Furthermore, a large number
of repeated runs of the same experiment may reveal information about
possible experimental errors.
Scientific workflow systems (cf., e.g., [27, 106, 355] for surveys) support
and automate the execution of error-prone, repetitive tasks such as data
access, transformation, and analysis. In contrast to manual execution of com-
putational experiments (i.e., manually invoking the single steps one after
another), creating and running workflows from services increases the speed
and reliability of the experiments:
Workflows accelerate analysis processes significantly. The difference be-
tween manual and automatic execution time increases with the complex-
ity of the workflow and with the amount of data to which it is applied. As
manual analyses require the full attention of an (experienced) human user,
they are furthermore expensive, as they can easily consume
a considerable amount of manpower. For instance, assume a workflow
that needs 5 minutes to perform an analysis which requires 20 minutes
when carried out manually. When applied to 100 data sets, it runs for
8:20 h, while the human user would be occupied for 33:20 h, which cor-
responds to almost a man-week of work. What is more, the automatic
analysis workflows run autonomously in the background, possibly also
overnight, so that the researcher can focus on other tasks in
the meantime.
Workflows achieve a consistent analysis process. By applying the same
parameters to each data set, they directly produce comparable and repro-
ducible results. Such consistency cannot be guaranteed when the analysis
is carried out repeatedly by a human user, who naturally becomes tired and
inattentive when performing the same steps again and again. When the
analyses are carried out by different people, the situation gets even worse,
as achieving consistent behavior across different users is more difficult
still.
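As a quick sanity check of the figures in the example above, the claimed speed-up can be reproduced with a few lines of Python; the function name and structure here are purely illustrative and not taken from the text:

```python
def total_hours(minutes_per_run: float, num_datasets: int) -> float:
    """Total processing time in hours for a batch of data sets."""
    return minutes_per_run * num_datasets / 60

# 100 data sets, 5 min per automated run vs. 20 min per manual run
automated = total_hours(5, 100)
manual = total_hours(20, 100)

print(f"automated: {automated:.2f} h")  # 8.33 h  (= 8:20 h)
print(f"manual:    {manual:.2f} h")     # 33.33 h (= 33:20 h)
```

The gap of roughly 25 hours of human attention per 100 data sets is what makes automation pay off, and it grows linearly with the number of data sets.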
Focusing on their actual technical realization, [193] describes the design
and development of scientific workflows in terms of a five-phase life cycle (cf.
Figure 1.5): The starting point is the formulation of a scientific hypothesis that
has to be tested or specific experimental goals that have to be reached. In
the subsequent workflow design phase the corresponding workflow is shaped.
Services and data must be put together to form a workflow that addresses
the identified research problem. It is also possible that (parts of) existing
workflows can be re-used or adapted to meet the needs of the new workflow.
The workflow preparation phase is then concerned with the more technical
preparations (e.g., specific parameter settings or data bindings) which
are the prerequisites for workflow enactment in the execution phase. Finally,
there is a so-called post-execution analysis phase, meaning the inspection and
 