A Policy-Based Security Framework for Privacy-Enhancing Data Access and Usage Control in Grids - Cloud, Grid and High Performance Computing: Emerging Applications

Grid job, the user does not only provide input data to a pre-defined service provided by an SP. Instead, the user lets their own program code make use of the CPU and storage capacities provided by the SPs that are involved in the Grid.

For privacy and data protection in Grids, the immediate consequence is that any data related to a user's Grid job must be treated similarly to the user's PII:

• On the SP side, the considerations for the input data must also be applied to the Grid job's output data. Depending on the Grid job, the output data may be even more sensitive than the input data. As an example, consider data mining on medical data that derives a set of potentially terminally ill patients. Thus, there must be an agreement about how the output data must be treated, both while the Grid job is running and after it has finished. This affects, for example, whether the output data has to be deleted from the service provider's systems after the user has retrieved it, or whether it should be kept, e.g., as input data for a subsequently submitted follow-up Grid job.

• The Grid job's code, regardless of whether it is distributed in source or binary format, should be considered intellectual property of the Grid user. Especially in commercial Grid environments, program code submitted by one user must not be redistributed by the service provider or made available to other users. This also affects whether an SP may modify the program code, e.g., in order to optimize it for the local computing architecture.

Additional aspects, such as whether the SP is allowed to back up or even archive these Grid job components, must also be taken into consideration.
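The handling rules discussed above for a Grid job's output data and code could be recorded as a small per-job agreement. The following Python sketch is purely illustrative: the names `JobPolicy`, `OutputRetention`, and `may_sp_perform`, as well as the action strings, are hypothetical and do not belong to any particular Grid middleware.

```python
from dataclasses import dataclass
from enum import Enum


class OutputRetention(Enum):
    """What the SP does with output data once the user has retrieved it."""
    DELETE_AFTER_RETRIEVAL = "delete"
    KEEP_FOR_FOLLOW_UP_JOB = "keep"


@dataclass(frozen=True)
class JobPolicy:
    """Hypothetical per-job agreement between the Grid user and an SP."""
    output_retention: OutputRetention
    sp_may_modify_code: bool = False        # e.g., optimization for the local architecture
    sp_may_redistribute_code: bool = False  # code is the user's intellectual property
    sp_may_backup: bool = False
    sp_may_archive: bool = False


def may_sp_perform(policy: JobPolicy, action: str) -> bool:
    """Return True only if the policy explicitly permits the action."""
    permissions = {
        "modify_code": policy.sp_may_modify_code,
        "redistribute_code": policy.sp_may_redistribute_code,
        "backup": policy.sp_may_backup,
        "archive": policy.sp_may_archive,
        "keep_output": policy.output_retention
        is OutputRetention.KEEP_FOR_FOLLOW_UP_JOB,
    }
    # Actions that the policy does not know about are denied by default.
    return permissions.get(action, False)


# Example: the SP may optimize the code but must delete the output afterwards.
policy = JobPolicy(
    output_retention=OutputRetention.DELETE_AFTER_RETRIEVAL,
    sp_may_modify_code=True,
)
```

Defaulting every permission to `False` reflects the assumption that an SP may do nothing with the job components unless the user has agreed to it.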

As a resulting requirement, services that are shared by multiple or all organizations in the Grid, such as globally distributed file systems, must provide sufficient access control mechanisms to prevent organizations that are not involved in a particular Grid job from accessing its code, input data, and output data. This achieves confidentiality and a separation of concerns on an organizational level (see also Cunsolo, Distefano, Puliafito, and Scarpa (2010)). In this context, it should be noted that encrypting input and output data would hardly increase security as long as a potentially malicious SP runs the Grid job and thus gains access to the data in clear text.
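As a minimal sketch of the access control requirement for shared services, the check below grants access to a job's components only to organizations involved in that job. The names `GridJob` and `may_access` and the organization identifiers are assumptions made for illustration; a real shared file system would enforce this through its own authorization mechanism.

```python
from dataclasses import dataclass

# The three job components named in the text.
JOB_COMPONENTS = frozenset({"code", "input", "output"})


@dataclass(frozen=True)
class GridJob:
    """Hypothetical record of a Grid job on a shared service."""
    job_id: str
    involved_orgs: frozenset  # organizations taking part in this job


def may_access(job: GridJob, requesting_org: str, component: str) -> bool:
    """Allow access only to organizations involved in the given job."""
    if component not in JOB_COMPONENTS:
        raise ValueError(f"unknown job component: {component!r}")
    return requesting_org in job.involved_orgs


job = GridJob(job_id="job-1", involved_orgs=frozenset({"org-a", "org-b"}))
print(may_access(job, "org-a", "output"))  # involved organization
print(may_access(job, "org-c", "input"))   # organization outside the job
```

Such a check separates organizations from one another, but it does not protect the data from an SP that is itself involved in the job.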

Privacy and data protection settings may also vary with each Grid job, independent of the users' preferences regarding their own PII. As a consequence, the logical separation between PII and Grid job privacy management must be accounted for. This is not only relevant for Grid job execution engines, but also, e.g., for the design of (graphical) user interfaces.

• Input data for the Grid job may contain sensitive data, e.g., when Grid-based data mining is performed on large sets of medical data. In this case, the Grid user and the SP share several responsibilities. On the one hand, the Grid user must have permission to submit the data to the SP; this is a non-trivial organizational task because the Grid service providers that will be used are, in general, unknown at the point in time when the input data is being collected. On the other hand, the SP to which a Grid job has been submitted is typically not allowed to make any use of the input data other than feeding it into the Grid job's code. Thus, similarly to the handling of PII, the user and the SP must agree on a set of purposes for which the data may be used. Obviously, there must be technical means to enforce this binding.
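The purpose binding described in this item can be illustrated with a small sketch in which data is released only for purposes the user and the SP have agreed on. The class `BoundData`, the exception `PurposeViolation`, and the purpose strings are hypothetical. The sketch shows only the check itself; it is not an enforcement mechanism that would withstand an SP deliberately bypassing it.

```python
from dataclasses import dataclass


class PurposeViolation(Exception):
    """Raised when data is requested for a purpose that was not agreed on."""


@dataclass(frozen=True)
class BoundData:
    """Input data together with the purposes agreed between user and SP."""
    payload: bytes
    allowed_purposes: frozenset

    def use_for(self, purpose: str) -> bytes:
        """Release the payload only for an agreed purpose."""
        if purpose not in self.allowed_purposes:
            raise PurposeViolation(f"purpose not agreed: {purpose!r}")
        return self.payload


# Example: the input data may only be fed into the Grid job's code.
records = BoundData(
    payload=b"medical records",
    allowed_purposes=frozenset({"grid_job_input"}),
)
data = records.use_for("grid_job_input")  # permitted use

try:
    records.use_for("marketing")          # not part of the agreement
except PurposeViolation as err:
    print("denied:", err)
```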
