facto standard for user and host authentication
on the Grid. GSI is used by most mature Grid
middleware implementations. Shortcomings of
this infrastructure are described later in this paper;
here we introduce the basic GSI infrastructure.
GSI essentially comprises a Public Key
Infrastructure (PKI) that is used to sign user
identity and host certificates. Users can create
limited-lifetime Proxy certificates, which allow
them to send credentials with their jobs for
authentication without the risk of compromising
the user's private key. Proxy certificates are used
for all transactions by a job, such as GridFTP
transactions. Here we assume that all authorization
decisions with regard to data are based on GSI user
authentication by means of Proxy certificates.
Other approaches, such as role-based or
attribute-based authorization as proposed in
(Alfieri et al., 2004), are possible but not
required for our framework. Many Grid
infrastructures manage access control to resources
and storage based on virtual organization (VO)
membership information. However, VO-based
authorization is often too coarse-grained for
protecting medical information: there may be many
users (e.g., researchers) in a VO, who may not all
be equally trusted to access particular data.
Therefore, we assume authorization based on user
identities in this paper.
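The difference in granularity between VO-based and identity-based authorization can be illustrated with a minimal Python sketch. It is not part of GSI or of any Grid middleware: the `AccessPolicy` class, the two check functions, the VO name, and the example distinguished names are all hypothetical, and the sketch assumes that the user's identity has already been established by verifying a Proxy certificate.

```python
from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    """Hypothetical per-file policy, invented for this illustration."""
    allowed_vos: set = field(default_factory=set)       # coarse-grained: whole VOs
    allowed_subjects: set = field(default_factory=set)  # fine-grained: user identities


def vo_based_allowed(policy, user_vos):
    """VO-based check: any member of an allowed VO is granted access."""
    return not policy.allowed_vos.isdisjoint(user_vos)


def identity_based_allowed(policy, subject_dn):
    """Identity-based check: only explicitly listed users are granted access."""
    return subject_dn in policy.allowed_subjects


# Illustrative identities (assumed to come from already-verified certificates).
ALICE_DN = "/O=ExampleGrid/OU=Hospital/CN=Alice"
BOB_DN = "/O=ExampleGrid/OU=University/CN=Bob"

scan_policy = AccessPolicy(
    allowed_vos={"medical-imaging"},
    allowed_subjects={ALICE_DN},
)

# Both users are members of the same VO, so a VO-based check
# cannot distinguish between them.
alice_vo_ok = vo_based_allowed(scan_policy, {"medical-imaging"})
bob_vo_ok = vo_based_allowed(scan_policy, {"medical-imaging"})

# An identity-based check admits only the listed user.
alice_id_ok = identity_based_allowed(scan_policy, ALICE_DN)
bob_id_ok = identity_based_allowed(scan_policy, BOB_DN)

print(alice_vo_ok, bob_vo_ok)  # True True
print(alice_id_ok, bob_id_ok)  # True False
```

In this sketch both researchers pass the VO-based check, while only the explicitly listed user passes the identity-based check, which is the property the framework relies on.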
PROBLEM ANALYSIS
Grids are, by nature, distributed across multiple
administrative domains, only a few of which
may be trusted by a specific data owner. Grid
middleware, and thus jobs, typically run on an
operating system (OS), such as Linux, that allows
administrators to access all information on
the system. A job or data owner does not have
control over the hardware or software that runs on
some remote system. Besides OS and middleware
vulnerabilities, these systems might also not be
well protected against physical attacks, such as
stealing hard disks. Such aspects should be part
of a risk assessment when decisions are made on
which sites are trusted to store or access particular
information.

Given legal constraints, trust decisions will and
should be conservative. For example, unencrypted
data, file names, and other sensitive metadata
should only be stored in trusted domains, e.g., in
the hospital. This aspect is even more pressing
in systems where jobs on remote machines can
access medical data. Current OSs such as Linux
provide little assurance that information stored
on the system cannot be leaked to external parties
(van 't Noordende, Balogh, Hofman, Brazier and
Tanenbaum, 2007).

Even if files are removed after the job exits
(e.g., temporarily created files), the contents could
be readable by administrators or possibly attackers
while the job executes. Furthermore, disks may
contain left-over information from a job's previous
execution, which is readable by an attacker
who gains physical access to a storage device, if
the system is not properly configured (NIST). As
another example, it is possible to encrypt swap
space in a safe way, but this is an option that has to
be explicitly enabled in the OS. For these reasons,
it is important for a data owner to identify critical
aspects of the administration and configuration
of a remote host, before shipping data to (a job
running on) that host.

Another problem is that a data owner can neither
control nor know the trajectory that a job took
before it was scheduled on a host, since this is
implicit and hidden in current Grid middleware.
Therefore, even if the host from which a job
accesses data is trusted by the data owner, there
is a risk that the job was manipulated on some
earlier host.

Current middleware does not provide a way
to securely bind jobs to Proxy certificates: a
certificate or private key bundled with a program
can easily be extracted and coupled to another
program which pretends to be the original program.
In Grids, this issue is exacerbated by the fact that
a job may traverse several middleware processes