Scheduler
specified destinations during runtime. Therefore, instead of dividing the execution of the jobs into three separate phases (download, run, upload), the execution of all steps is attempted at the same time: the input data is provided in parallel with the running of the job.
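This overlap can be pictured with a small sketch in which a feeder thread keeps delivering input segments while the main thread runs the job; transfer_segment() and run_job() below are placeholders standing in for the real Grid operations, not functions of the described system.

/* compile: gcc -pthread feeder_sketch.c */
#include <pthread.h>
#include <stdio.h>

struct segment { const char *file; long offset, length; };

static void transfer_segment(const struct segment *s)
{
    /* Placeholder: would copy one segment of an input file to the CE. */
    printf("transferring %s [%ld, %ld)\n",
           s->file, s->offset, s->offset + s->length);
}

static int run_job(const char *job)
{
    /* Placeholder: would start the job and block until it finishes. */
    printf("running %s\n", job);
    return 0;
}

static void *feed_input(void *arg)
{
    /* Deliver the remaining segments while the job is already running;
     * each segment must arrive before the job first accesses it. */
    for (const struct segment *s = arg; s->file != NULL; s++)
        transfer_segment(s);
    return NULL;
}

int main(void)
{
    struct segment first = { "input.dat", 0, 4096 };         /* needed at start */
    struct segment rest[] = { { "input.dat", 4096, 65536 },  /* needed later */
                              { NULL, 0, 0 } };
    pthread_t feeder;

    transfer_segment(&first);              /* pre-run transfer only */
    pthread_create(&feeder, NULL, feed_input, rest);
    int rc = run_job("./job");             /* the download overlaps the run */
    pthread_join(&feeder, NULL);
    return rc;
}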
The algorithm of the dynamic data feeder scheduler is similar to that of the static data feeder scheduler, with two differences (a code sketch of both follows below):
1. The estimated job execution time takes into account that the relevant parts of the necessary files may be delivered after the job is started (but before the job would access them). Therefore the calculation of fileTransferTime(c, d) includes only the pre-run and post-run file transfer times; it does not include the transfer time of file segments that are copied in parallel with the running of the job.
2. Replication commands are generated that allow the relevant file segments to be copied in parallel with the running of the job.

Please note that, compared to the static data feeder strategy, the estimated completion time of a given job will be lower in most cases.
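Both differences can be summarised in a short sketch; the segment classification, the fileTransferTime() signature and the replication command syntax below are assumptions made for illustration, not taken from the described system.

#include <stdio.h>

/* Assumed classification of file segments, derived from the job behaviour
 * description: PRE_RUN segments must arrive before the job starts, POST_RUN
 * segments are uploaded after it finishes, and DURING_RUN segments can be
 * copied while the job is running. */
enum phase { PRE_RUN, DURING_RUN, POST_RUN };

struct file_segment {
    const char *file;   /* logical file name */
    long bytes;         /* segment size */
    enum phase phase;   /* when the segment is needed */
};

/* Difference 1: only pre-run and post-run segments contribute to
 * fileTransferTime(c, d); segments copied in parallel with the job
 * are excluded from the estimate. */
double fileTransferTime(const struct file_segment *segs, int n,
                        double bandwidth /* bytes/s between c and d */)
{
    double t = 0.0;
    for (int i = 0; i < n; i++)
        if (segs[i].phase != DURING_RUN)
            t += segs[i].bytes / bandwidth;
    return t;
}

/* Difference 2: emit one replication command per DURING_RUN segment so the
 * data can be delivered while the job runs. The command syntax is invented
 * for illustration. */
void emit_replication_commands(const struct file_segment *segs, int n,
                               const char *ce)
{
    for (int i = 0; i < n; i++)
        if (segs[i].phase == DURING_RUN)
            printf("replicate file=%s bytes=%ld dest=%s\n",
                   segs[i].file, segs[i].bytes, ce);
}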
Description Generator
The Description generator is implemented as a shared library that monitors the resource access activity of jobs and prepares the job descriptions by analysing the pattern of activities.
File access monitoring is based on the interception of the standard file handling operations defined in the stdio.h, fcntl.h and unistd.h headers.
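One common way to realise such interception on Linux is an LD_PRELOAD shared library whose wrappers log the call and then forward to the real implementation; the text does not name the exact mechanism, so the sketch below is illustrative only.

/* build: gcc -shared -fPIC -o monitor.so monitor.c -ldl
 * run:   LD_PRELOAD=./monitor.so ./job */
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>

/* Wrapped fopen(): log the call, then forward to the next fopen in link
 * order. read(), write(), open(), lseek() and the other monitored calls
 * can be wrapped the same way. */
FILE *fopen(const char *path, const char *mode)
{
    static FILE *(*real_fopen)(const char *, const char *);
    if (!real_fopen)
        real_fopen = (FILE *(*)(const char *, const char *))
                     dlsym(RTLD_NEXT, "fopen");
    fprintf(stderr, "monitor: fopen(\"%s\", \"%s\")\n", path, mode);
    return real_fopen(path, mode);
}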
In general, for a given file operation, the name of the operation, the file or stream descriptor, the name of the file, the opening mode flags, the amount of data read or written, and the new position in the stream are considered, as applicable to the operation.
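As a hedged illustration, a record for one monitored operation might hold exactly these attributes; the actual layout used by the Description generator is not given in the text.

/* Hypothetical record type for one monitored operation; the fields mirror
 * the attributes listed above. */
struct file_access_event {
    char op[16];       /* name of the operation, e.g. "open", "read", "lseek" */
    int  fd;           /* file or stream descriptor */
    char name[256];    /* name of the file, when known */
    int  flags;        /* opening mode flags, for open-like operations */
    long bytes;        /* amount of data read or written */
    long position;     /* new position in the stream, for seek operations */
};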
CPU usage information is collected between two consecutive file access operations. The /proc process information pseudo-filesystem (LinuxForum: Linux Filesystem Hierarchy, 1.10. /proc) is used to access the kernel data structures containing the necessary CPU consumption information.
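For example, the per-process file /proc/<pid>/stat exposes the consumed user and system CPU time as fields 14 and 15, in clock ticks (see proc(5)); the text does not say which /proc file is read, so the sketch below is one plausible realisation.

#include <stdio.h>
#include <unistd.h>

/* CPU time (user + system, in seconds) consumed so far by the process
 * with the given pid, read from /proc/<pid>/stat. */
double cpu_seconds(pid_t pid)
{
    char path[64];
    unsigned long utime, stime;

    snprintf(path, sizeof path, "/proc/%d/stat", (int)pid);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1.0;
    /* %*[^)] skips the parenthesised command name, which may contain spaces
     * (a name containing ')' would need a more careful parser); the ten %*s
     * skip fields 4-13, then utime and stime are read. */
    if (fscanf(f, "%*d (%*[^)]) %*c %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %lu %lu",
               &utime, &stime) != 2) {
        fclose(f);
        return -1.0;
    }
    fclose(f);
    return (double)(utime + stime) / (double)sysconf(_SC_CLK_TCK);
}

Calling such a function immediately before and after a monitored file operation and subtracting the two readings yields the CPU usage between two consecutive file accesses.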
Because the component (for administrative
reasons) cannot be deployed to all computers of
IMPLEMENTATION
The proposed architecture cannot be deployed completely in existing “production” Grid environments; the lack of administrative/authoritative credentials and missing services are among the most important reasons. We have chosen to extend the P-GRADE portal (P-GRADE portal) with our proposed components, as this allowed us to implement an adapted version of the static scheduler. P-GRADE is a parallel application development system for the Grid which (among others) implements job scheduling, migration and checkpointing. P-GRADE supports the Globus (Globus Toolkit) and Condor (Condor Project) Grid environments.

The Portal runs a Java applet in the user's browser which communicates with the server layer. In order to implement the proposed components we needed to extend both the rich client and the server layer. On the extended Portal interface the user can specify which scheduler algorithm should be used by the system. If our scheduler is selected, the user also has to provide the job behaviour description.

Because the P-GRADE portal does not allow querying the size of input files directly, the implemented scheduler cannot take the actual sizes into account when estimating the finishing time of a job on a CE. Instead, the absolute file sizes contained in the job behaviour description are used. Moreover, the scheduler does not know the length of the wait queues of the CEs; therefore the maximum job running time estimates specified by the job submitters are used. A sketch of the resulting estimate follows.
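Under these constraints the estimate reduces to a simple sum, reflecting the adapted static strategy; the structure fields and the function name below are assumptions, not the portal's API.

/* Assumed summary of a job behaviour description, reduced to what the
 * adapted scheduler can actually use in the portal. */
struct job_desc {
    double total_input_bytes;   /* absolute file sizes from the description */
    double total_output_bytes;
    double max_running_time;    /* upper bound declared by the submitter, s */
};

/* Estimated finishing time of the job on a compute element, relative to
 * now; bandwidth is the transfer rate to/from the CE in bytes per second.
 * The CE queue wait time is unknown and therefore not represented. */
double estimated_finish(const struct job_desc *j, double bandwidth)
{
    double transfer = (j->total_input_bytes + j->total_output_bytes) / bandwidth;
    return transfer + j->max_running_time;
}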