can be executed. During execution, a workflow system may record provenance
information (data and process history, see Chapter 12 and Section 13.5) as
well as provide real-time monitoring and failover functions. Depending on
the system, provenance information generally involves the recording of the
steps that were invoked during workflow execution, the data consumed and
produced by each step, a set of data dependencies stating which data was
used to derive other data, the parameter settings used for each step, and so
on. If a workflow can change while executing (e.g., due to changing resource
availability), the evolution of such a dynamic workflow may be recorded as
well in order to support subsequent execution analysis.
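As a rough illustration of the kinds of provenance information listed above, the following minimal Python sketch records, for each invoked step, the data it consumed and produced and the parameter settings it used, and derives data dependencies from those records. The class names (StepRecord, ProvenanceLog), the step names, and the file names are hypothetical and chosen only for illustration; actual workflow systems define their own provenance models.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class StepRecord:
    """One invoked step: what it consumed, what it produced, and its parameters."""
    step: str
    inputs: list[str]
    outputs: list[str]
    params: dict[str, Any]


@dataclass
class ProvenanceLog:
    """Hypothetical in-memory provenance store (an assumption, not a real system's API)."""
    records: list[StepRecord] = field(default_factory=list)

    def record(self, step, inputs, outputs, params):
        # Copy the arguments so later changes by the caller do not alter history.
        self.records.append(StepRecord(step, list(inputs), list(outputs), dict(params)))

    def dependencies(self):
        """Return (derived, source) pairs: every output depends on every input of its step."""
        return [(out, inp)
                for r in self.records
                for out in r.outputs
                for inp in r.inputs]


# Illustrative two-step run; step and file names are made up.
log = ProvenanceLog()
log.record("align", ["reads.fq"], ["aligned.bam"], {"max_mismatch": 2})
log.record("call_variants", ["aligned.bam"], ["variants.vcf"], {"min_quality": 30})
```

Here the dependency pairs are computed from the step records rather than stored separately, which is one simple design choice; a system could equally record them explicitly as it runs.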
Workflow Execution Analysis. After workflow execution, scientists often
need to inspect and interpret workflow results. This involves evaluation of
data products (does this result make sense?), examination of workflow execution
traces (is this how the result should have arisen?), workflow debugging
(what went wrong here?), and performance analysis (why did this take so
long?).
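Two of these questions can be sketched as simple queries over recorded execution information. The Python sketch below assumes that data dependencies are available as (derived, source) pairs and that per-step run times have been recorded in seconds; the function names, file names, and timings are hypothetical and serve only as an illustration.

```python
def lineage(target, dependencies):
    """Return every data item that `target` was directly or indirectly derived from."""
    parents = {}
    for derived, source in dependencies:
        parents.setdefault(derived, set()).add(source)

    seen = set()
    stack = [target]
    while stack:
        item = stack.pop()
        for src in parents.get(item, ()):
            if src not in seen:  # skip items that were already visited
                seen.add(src)
                stack.append(src)
    return seen


def slowest_step(durations):
    """Return the name of the step with the largest recorded run time."""
    return max(durations, key=durations.get)


# Illustrative inputs; all names and numbers are made up.
deps = [
    ("aligned.bam", "reads.fq"),
    ("variants.vcf", "aligned.bam"),
    ("variants.vcf", "reference.fa"),
]
durations = {"align": 420.0, "call_variants": 95.5}

ancestors = lineage("variants.vcf", deps)   # how did this result arise?
bottleneck = slowest_step(durations)        # why did this take so long?
```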
Workflow and Result Sharing. Data and workflow products can be
published and shared. As workflows and data products are committed to a
shared repository, new iterations of the workflow life cycle can begin.
User Roles. Users of scientific workflow systems can play a number of
different roles within the above phases. A workflow designer is usually a
scientist who develops a new experimental or analytical protocol (or a new
variant of an existing method). As mentioned above, a workflow design is often
elicited through some form of requirements analysis, and the design and
associated requirements can be used by a workflow engineer to implement the
associated abstract or executable workflow description. A workflow operator is
a user who executes workflows on the desired inputs. An operator may launch
a workflow directly via a scientific workflow system, or indirectly through
another application (e.g., within a Web portal), monitor the execution (e.g.,
via a workflow dashboard), and subsequently validate results based on stored
provenance information. The above user roles are not necessarily disjoint; for
example, a single person may assume the roles of designer, engineer, and
operator. Indeed, scientific workflow systems aim to make workflow design,
execution, and result analysis easier than with traditional script-based
approaches to scientific process automation.
13.2.2 Types of Scientific Workflows
Scientific workflows can be used to model and automate scientific processes
from many different science domains (for example, particle physics,
bioinformatics, ecology, and cosmology, to name a few). Not surprisingly, such
workflows can exhibit very different characteristics. For example, workflows
might be exploratory in nature, starting from ad hoc designs and then
requiring frequent changes to the workflow design, parameter settings, etc., to