Introduction - User-Level Workflow Design: A Bioinformatics Perspective

The researchers who plan the experiments are experts in a particular domain,
not skilled programmers. Thus, a helpful workflow management system for
scientific research must support intuitive and rapid process design, ideally on
a level of abstraction that focuses on defining essential aspects of the workflow
(such as the execution order of services, input data, and transition conditions
[166]) rather than on technical and syntactical details of a (standard)
programming language. This implies a software architecture that separates the
engine that deals with the heterogeneous services from the interface that
presents a unified world to the end user, ideally by means of a comprehensive
and intuitive graphical user interface that makes the system easy to use and
facilitates agile workflow development. Graphical workflow representations
are furthermore advantageous as they directly provide intuitive documentation
of the computational experiment [128].
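The separation between engine and user-facing workflow description can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names SERVICES and run_workflow and the two toy services are hypothetical and are not taken from any workflow system discussed in the text. The workflow itself is plain data (the execution order of services plus the input data), while a small engine is the only part that knows how the services are called.

```python
# Hypothetical sketch: the workflow is plain data, the engine is separate code.
from typing import Callable, Dict, List


# Toy "services"; real ones would wrap heterogeneous remote or local tools.
def reverse_sequence(seq: str) -> str:
    return seq[::-1]


def complement(seq: str) -> str:
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in seq)


# Registry that hides how each service is actually invoked.
SERVICES: Dict[str, Callable[[str], str]] = {
    "reverse": reverse_sequence,
    "complement": complement,
}


def run_workflow(steps: List[str], data: str) -> str:
    """Engine: execute the named services in the given order."""
    for name in steps:
        data = SERVICES[name](data)
    return data


# What the domain expert defines: execution order and input data only.
workflow = ["reverse", "complement"]
result = run_workflow(workflow, "ATGC")
print(result)
```

In such a design, a graphical editor would only need to produce the list of step names; the engine and the service registry could change without affecting what the user sees.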

Requirement 2: powerful workflow model

A powerful workflow model for scientific applications, that is, a model that is
fit for implementing complex analysis procedures, has to comprise functionality
for control-flow handling, data handling, and hierarchical modeling [128].

• Requirement 2.1: control-flow handling

The ability to define basic aspects like the order of workflow steps and
transition conditions between them is of course essential; ideally,
synchronization requirements should also be expressible [166]. Furthermore,
control-flow structures like conditional branching and conditional loops are
required when building complex workflows. As discussed in [128], pure
“for each element” loops, which allow for iteration over the elements of a
list, are in general not sufficient.
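To illustrate why a pure "for each element" loop is not enough, consider a step that has to be repeated until a condition on its own result holds, so that the number of iterations is not known in advance. The following Python sketch is a toy under stated assumptions: refine, run_until_converged, the tolerance, and the round limit are hypothetical and stand in for an arbitrary iterative analysis step. It combines a conditional loop with a conditional branch.

```python
# Hypothetical sketch of a conditional loop plus a conditional branch.
from typing import Tuple


def refine(estimate: float) -> float:
    """Toy analysis step: one Newton iteration towards sqrt(2)."""
    return 0.5 * (estimate + 2.0 / estimate)


def run_until_converged(start: float, tolerance: float = 1e-9,
                        max_rounds: int = 100) -> Tuple[float, int, bool]:
    """Repeat the step until the result stops changing (conditional loop)."""
    current = start
    for rounds in range(1, max_rounds + 1):
        updated = refine(current)
        # Transition condition: leave the loop once the change is small enough.
        if abs(updated - current) < tolerance:
            return updated, rounds, True
        current = updated
    return current, max_rounds, False


value, rounds, converged = run_until_converged(1.0)

# Conditional branching: choose the follow-up step depending on the outcome.
if converged:
    status = "converged"
else:
    status = "gave up"
print(status, round(value, 6), rounds)
```

The loop bound max_rounds is only a safety limit; what ends the iteration is the condition on the data, which cannot be expressed by iterating over a fixed list.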

• Requirement 2.2: data handling

Typically realizing data analysis processes, scientific workflows rely on
input data from which they finally derive (a set of) results, usually also
producing a number of intermediate or partial results [166]. Hence, scientific
workflow systems have to provide adequate support for defining the flow of
data between the services in the workflow. Technically, there are basically
two possibilities for passing data between the individual services within the
workflow: transferring the data via “pipelines” from one service to another
(messaging approach), or assigning identifiers to all data items so that the
services access them via named variables (shared-memory approach).
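The two data-passing styles can be contrasted in a short Python sketch. All names here (gc_content, classify, run_pipeline, run_shared, and the keys of the context dictionary) are hypothetical and chosen only for illustration. In the messaging style, the output of one service is handed directly to the next; in the shared-memory style, every data item gets an identifier and the services read and write named variables.

```python
# Hypothetical sketch contrasting two ways of passing data between services.
from typing import Any, Dict


def gc_content(seq: str) -> float:
    """Toy service: fraction of G and C bases in a sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def classify(fraction: float) -> str:
    """Toy service: label a sequence by its GC fraction."""
    return "GC-rich" if fraction > 0.5 else "AT-rich"


# Messaging approach: data flows through a "pipeline" from service to service.
def run_pipeline(seq: str) -> str:
    return classify(gc_content(seq))


# Shared-memory approach: services access data items via named variables.
def run_shared(seq: str) -> Dict[str, Any]:
    context: Dict[str, Any] = {"sequence": seq}
    context["gc"] = gc_content(context["sequence"])
    context["label"] = classify(context["gc"])
    return context  # intermediate results stay accessible by name


piped = run_pipeline("GGCCAT")
shared = run_shared("GGCCAT")
print(piped, shared["label"], shared["gc"])
```

In this sketch the shared-memory variant keeps the intermediate GC fraction available under its identifier, whereas the pipeline variant exposes only the final result.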

As for the supported data items themselves, it is mandatory to support
primitive data types and (one-dimensional) lists [128], but complex data
types and multi-dimensional lists are also frequently needed. In order to
make it easy for the user to utilize the data handling capabilities of the
workflow model, workflow systems should furthermore support basic data
processing functionality (such as arithmetic operations, string manipulations
and sub-data access methods [128], but also means for data
