Commercial Use of WS-PGRADE/gUSE - Science Gateways for Distributed Computing Infrastructures

Based on the legal background and business needs, metadata and labels are automatically generated and assigned to the content:

labeling: retrieving and assigning the most relevant expressions (results of ranking) as a “tag cloud” to the processed document,

metadating: constructing an XML-based metadata structure (Management Information Resources for eGovernment, MIReG) from the metadata retrieved from the processed document.
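The labeling step above can be illustrated with a minimal sketch. The source does not say how expressions are ranked, so this example assumes plain term frequency; the function name `tag_cloud`, the stop-word list, and the sample text are hypothetical and not part of the eDOX product.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a language-specific one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "on", "for", "by"}

def tag_cloud(text, top_n=5):
    """Return the top_n (term, count) pairs, ranked by term frequency.

    Assumption: ranking is by raw frequency. Only ASCII letters are
    matched, so accented characters would need a wider pattern.
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]

sample = ("The contract binds the seller and the buyer. "
          "The buyer pays the seller; the contract ends.")
print(tag_cloud(sample, top_n=3))
```

The returned pairs could then be attached to the processed document as its labels; a production system would likely rank on word roots rather than surface forms.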

The cryptographic module is invoked to create an electronic signature on the whole data package. Legislation requires this functionality so that the digitized documents are legally acceptable:

digital signing: creating an XML-based electronic signature (XML Advanced Electronic Signatures, XAdES) on a processed document and its related outputs by using a cryptographic private key (e.g., RSA) and retrieving a timestamp and revocation information (e.g., a Certificate Revocation List, CRL, or an Online Certificate Status Protocol, OCSP, response).

As a commercial product, the eDOX Archiver Gateway manages user accounts and supports clearing functionality. Integrating a payment solution is outside the scope of the product. There are several methods for topping up the credit of a user account (e.g., a money transfer to a central bank account via a netbanking solution), but these methods must be supported by the environment:

accounting: calculating accountable values based on the real usage of the service (e.g., pages processed per job).
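A minimal sketch of such usage-based accounting follows, assuming a flat price per processed page that is deducted from a prepaid credit balance. The tariff, the `Job` type, and the `settle` function are hypothetical illustrations, not the product's actual interface.

```python
from dataclasses import dataclass

# Hypothetical tariff: credit units charged per processed page.
PRICE_PER_PAGE = 2

@dataclass
class Job:
    job_id: str
    pages_processed: int

def job_charge(job, price_per_page=PRICE_PER_PAGE):
    """Accountable value of one job: pages processed times the unit price."""
    return job.pages_processed * price_per_page

def settle(balance, jobs, price_per_page=PRICE_PER_PAGE):
    """Deduct the charges of all jobs from a prepaid balance.

    Raises ValueError if the balance does not cover the total charge.
    """
    total = sum(job_charge(job, price_per_page) for job in jobs)
    if total > balance:
        raise ValueError("insufficient credit")
    return balance - total

jobs = [Job("a", 830), Job("b", 12)]
print(settle(2000, jobs))
```

Integer credit units are used here to avoid floating-point rounding in monetary values.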

The output electronic data package is, in a legal sense, equivalent to the input paper-based document.

19.3.3 Usage of the eDOX Archiver Gateway

During workflow design, the execution statistics of each step were analyzed, as shown in Table 19.1. Based on these statistics, computation-intensive and parallelizable functions, such as recognizing characters (OCRing) and finding the roots of the words (indexing), were mapped into the cloud in order to reduce the execution time of these jobs.

The eDOX Archiver Gateway is a commercial product in which users can set priorities and decide the level of parallelization based on the precalculated cost of allocating cloud resources and executing the jobs. In principle, the best throughput time could be achieved by allocating a separate cloud resource to each document page. However, in practice this could be different due to the default boot and setup time (10 min) of virtual machines. For example, in the case of a larger topic that has 830 pages, it is possible to dedicate a virtual machine to each page for OCRing, finding the roots and indexing. In this case, processing time is measured at
