Workflow Concept of WS-PGRADE/gUSE - Science Gateways for Distributed Computing Infrastructures

problem by using unique channels among source and sink jobs. This latter pattern is supported by gUSE.

The to multiple instance task (TMIT) pattern and the from multiple instance task (FMIT) pattern (Fig. 3.1 b and c, respectively) specify coherent interpretation methods; therefore, they are usually supported in pairs. Both patterns are defined between two connected tasks. TMIT defines the data transfer when the subsequent job is executed in multiple parallel instances, whereas FMIT covers the case in which multiple jobs precede a single job. TMIT has three variants, depending on how the data are partitioned and accessed: (1) shared data accessible by reference, (2) instance-specific data accessible by value, or (3) instance-specific data accessible by reference.

A gUSE dataflow requires and generates data as files, which leads to the conclusion that the “TMIT with instance-specific data accessible by value” pattern is supported by gUSE via the concept of generator port types, described in detail in Sect. 3.6.1. Nevertheless, in specific cases, when remote data storage systems are used, the other patterns are supported as well, in the sense that accessing remote data for manipulation means downloading a local copy of it. Therefore, the data manipulation does not take effect straight away on the shared data item, which postpones but does not resolve consistency issues. The FMIT pattern is implemented using the concept of collector ports in gUSE.
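The interplay of generator and collector ports can be illustrated with a minimal Python sketch. It is not gUSE code: the function names (`generator_job`, `worker_job`, `collector_job`, `run_pipeline`) and the word-counting task are assumptions chosen for illustration only. A generator writes one file per data item, each instance of the successor job receives its own file by value, and a collector consumes all files produced by the instances.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def generator_job(text: str, out_dir: Path, chunk_size: int) -> List[Path]:
    """Hypothetical generator port: emit one output file per data item."""
    words = text.split()
    files = []
    for start in range(0, len(words), chunk_size):
        path = out_dir / f"part_{start // chunk_size}.txt"
        path.write_text(" ".join(words[start:start + chunk_size]))
        files.append(path)
    return files


def worker_job(in_file: Path, out_dir: Path) -> Path:
    """One instance of the successor job; it works on its own file (TMIT by value)."""
    out_file = out_dir / f"count_{in_file.stem}.txt"
    out_file.write_text(str(len(in_file.read_text().split())))
    return out_file


def collector_job(in_files: List[Path]) -> int:
    """Hypothetical collector port: gather the outputs of all instances (FMIT)."""
    return sum(int(path.read_text()) for path in in_files)


def run_pipeline(text: str, chunk_size: int = 3) -> Tuple[int, int]:
    """Return (number of instances, total word count)."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        parts = generator_job(text, tmp_path, chunk_size)
        # Instances run sequentially here; on a real infrastructure they
        # could run in parallel because each one owns its input file.
        counts = [worker_job(part, tmp_path) for part in parts]
        return len(parts), collector_job(counts)
```

Because every instance receives a private file, no instance can interfere with the data of another, which is why the by-value variant fits a file-based dataflow naturally.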

Similarly to the previous patterns, the next two patterns are symmetric and are mostly implemented in pairs. In general, they are based on a modified job definition that allows the nodes to represent workflows as well. From this point of view, workflows can be used as subworkflows triggered by a job submission that covers the subworkflow from the outer workflow's point of view. Nevertheless, by definition subworkflows are the same as normal workflows. These patterns allow the data transfer to be specified between the representing node and the subworkflow, and vice versa.

Block task to subworkflow decomposition (BTSWD) specifies transferring data into a subworkflow, while subworkflow decomposition to block task (SWDBT) specifies the opposite direction. Both are supported by the concept of templates in gUSE (Fig. 3.1 d), introduced and detailed in Sect. 3.6.5.
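The two directions of data transfer can be sketched in Python as follows. This is an illustrative model only, not the gUSE template mechanism: the `Workflow` class, the `embed` helper, and the `in_map`/`out_map` parameters are hypothetical names. The sketch reflects the point that a subworkflow is an ordinary workflow, and that the representing node only maps data into it (BTSWD) and back out of it (SWDBT).

```python
from typing import Callable, Dict, List

# A node maps the data available so far to the new named outputs it produces.
Job = Callable[[Dict[str, str]], Dict[str, str]]


class Workflow:
    """A linear chain of nodes; a deliberately simplified workflow model."""

    def __init__(self, nodes: List[Job]):
        self.nodes = nodes

    def run(self, inputs: Dict[str, str]) -> Dict[str, str]:
        data = dict(inputs)
        for node in self.nodes:
            data.update(node(data))
        return data


def embed(sub: Workflow, in_map: Dict[str, str], out_map: Dict[str, str]) -> Job:
    """Wrap a workflow so that it can act as a node of an outer workflow.

    in_map:  outer data name -> subworkflow input name   (BTSWD direction)
    out_map: subworkflow output name -> outer data name  (SWDBT direction)
    """
    def node(outer_data: Dict[str, str]) -> Dict[str, str]:
        sub_inputs = {inner: outer_data[outer] for outer, inner in in_map.items()}
        sub_outputs = sub.run(sub_inputs)
        return {outer: sub_outputs[inner] for inner, outer in out_map.items()}
    return node


# Usage: the inner workflow is a normal Workflow, reused as a node.
inner = Workflow([lambda d: {"upper": d["text"].upper()}])
outer = Workflow([
    lambda d: {"greeting": "hello " + d["name"]},
    embed(inner, in_map={"greeting": "text"}, out_map={"upper": "result"}),
])
result = outer.run({"name": "guse"})
```

Only the names listed in the two maps cross the boundary, so the internal data of the subworkflow stays hidden from the outer workflow.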

The class of data transfer patterns contains patterns that focus on the different types of data transfer among the nodes. The pattern named data transformation - input/output describes the possibility of transforming the incoming data before it is processed by the application, or of transforming the data generated after the execution of the application. gUSE supports these patterns implicitly. Instead of simply executing the applications, a wrapper script is executed to set up the right environment. It copies the input files, manages the execution of the required application, and then handles the generated outputs according to the type of output channel. Thus, it sends the generated files back or stores them remotely and uses references, considering the number of data sets in the case of a generator port. The data transfer by reference - unlocked pattern is supported if the output files are stored remotely. In this case, just a reference is returned to the portal. Since the pattern does not deal with synchronization, the consistency of the remote data cannot be guaranteed; as a result, later modifications overwrite earlier ones.
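The wrapper behaviour and the unlocked by-reference transfer can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: `RemoteStore` stands in for a remote storage system, and `run_wrapped` stands in for the wrapper script; neither is part of gUSE. The second function shows why later modifications overwrite earlier ones when no locking is applied.

```python
from typing import Callable, Dict


class RemoteStore:
    """Hypothetical remote storage addressed by reference (a string key)."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def upload(self, ref: str, content: str) -> str:
        # No locking: any earlier content under this reference is replaced.
        self._items[ref] = content
        return ref

    def download(self, ref: str) -> str:
        # The caller receives a private local copy of the content.
        return self._items[ref]


def run_wrapped(store: RemoteStore, in_ref: str, out_ref: str,
                app: Callable[[str], str]) -> str:
    """Stand-in for the wrapper: stage in, execute, stage out by reference."""
    local_copy = store.download(in_ref)    # copy the input locally
    output = app(local_copy)               # run the application on the copy
    return store.upload(out_ref, output)   # only the reference is returned


def demo_lost_update() -> str:
    """Two jobs modify the same remote item without synchronization."""
    store = RemoteStore()
    store.upload("shared.txt", "base")
    copy_a = store.download("shared.txt")      # job A stages in
    copy_b = store.download("shared.txt")      # job B stages in
    store.upload("shared.txt", copy_a + "+A")  # job A stages out
    store.upload("shared.txt", copy_b + "+B")  # job B overwrites job A's result
    return store.download("shared.txt")
```

In `demo_lost_update` the contribution of job A is lost because job B worked on a copy taken before job A stored its result; the remote item ends up holding only job B's modification.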
