relative computing speeds of all five sites in the Grid, in which the value 1 represents the computing speed that results in the job execution times recorded in the original workload log. We also define a load vector, e.g. load = (ld1, ld2, ld3, ld4, ld5), which is used to derive different loading levels from the original workload data by multiplying the execution times of all jobs at site i by the load value ld_i.
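For illustration, the following sketch shows one way such speed and load vectors could be applied to the recorded runtimes, assuming that execution time is inversely proportional to a site's relative speed and directly proportional to its load value; the vector values, function name, and site indexing are hypothetical and not taken from the original study.

```python
# Illustrative only: rescaling a logged runtime with hypothetical speed and
# load vectors for a five-site Grid. A speed of 1 reproduces the execution
# time recorded in the original workload log.
speed = (1.0, 1.0, 2.0, 0.5, 1.0)   # hypothetical relative computing speeds
load = (1.0, 1.2, 1.0, 0.8, 1.5)    # hypothetical load vector (ld1..ld5)

def scaled_runtime(original_runtime, site):
    """Derive a job's execution time at `site` (0-based index) from the
    runtime recorded in the original workload log."""
    return original_runtime * load[site] / speed[site]

# Example: a job that took 3600 seconds in the log, replayed at site 4
# (index 3), which is half as fast and 0.8x loaded here.
print(scaled_runtime(3600, 3))   # 5760.0
```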
MOLDABLE JOB ALLOCATION ON HOMOGENEOUS PARALLEL COMPUTERS

Moldable job allocation takes advantage of the moldable property of parallel applications to improve overall system performance. For example, an intuitive idea is to allow a job to use fewer processors than originally specified so that it can start execution immediately when the system does not have enough free processors at that moment; otherwise the job would have to wait in a queue for an uncertain period of time. On the other hand, if the system has more free processors than a job's original requirement, the system might let the job run with more processors than originally required to shorten its execution time. This is called moldable job allocation in this paper. Through moldable job allocation, the system can therefore dynamically determine the runtime parallelism of a job before its execution, in order to improve system utilization or reduce the job's waiting time in queue.
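As a rough sketch of this idea (not the exact policies evaluated later in this section), the allocation decision at job start time could look as follows; the parameter names, the scalability limit max_procs, and the scale-up rule are assumptions made purely for illustration.

```python
def choose_allocation(requested, free, allow_scale_down=True,
                      allow_scale_up=True, max_procs=None):
    """Illustrative moldable-allocation decision made when a job is
    considered for execution. Returns the number of processors to
    allocate, or None if the job should keep waiting in the queue."""
    if free >= requested:
        if allow_scale_up and max_procs is not None and max_procs > requested:
            # Plenty of free processors: scale the job up, bounded by its
            # assumed scalability limit.
            return min(free, max_procs)
        return requested
    if allow_scale_down and free > 0:
        # Not enough free processors: run immediately on what is available
        # instead of waiting for the full request.
        return free
    # Otherwise the job waits until `requested` processors become free.
    return None
```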
For a specific job, we intuitively know that allowing higher parallelism can lead to a shorter execution time. When the overall system performance is concerned, however, the positive effect of raising a job's parallelism is less certain under the complex system behavior. For example, although raising a job's parallelism can reduce its required execution time, it might also increase the probability that other jobs have to wait longer in the queue, which would increase those jobs' waiting time and, in turn, their turnaround time. It is therefore not straightforward to know how raising a single job's parallelism would affect system-level performance, e.g. the average turnaround time of all jobs. On the other hand, reducing a job's parallelism might shorten its waiting time in the queue at the cost of an enlarged execution time, and it is not always clear whether the combined effect of the shortened waiting time and the enlarged execution time leads to a reduced or an increased overall turnaround time. Moreover, the reduced parallelism of a job would usually in turn decrease the waiting time of other jobs, which makes it even more complex to analyze the overall effect on system performance.
The above examples illustrate that the effects of moldable job allocation on overall system performance are complex and require further evaluation. In our previous work (Huang, 2006) we proposed two possible adaptive processor allocation policies. In this paper, we improve the two policies by requiring users to provide an estimated job execution time upon job submission, just as the backfilling algorithms require. The estimated job execution time is used to help the system determine whether to dynamically scale down a job's parallelism for immediate execution, i.e. a shorter waiting time at the cost of a longer execution time, or to keep the job waiting in the queue until the required number of processors becomes available.
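As a rough illustration of how such an estimate could be used (this is not the exact rule of the improved policies described in this section), the sketch below compares the estimated completion time of starting immediately on fewer processors against that of waiting for the full request; the linear speedup model, the waiting-time estimate est_wait, and all names are assumptions for illustration only.

```python
def scale_down_or_wait(requested, free, est_runtime, est_wait):
    """Decide between scaling a job down to the currently free processors
    and keeping it queued, using the user-provided runtime estimate
    est_runtime (given for `requested` processors). Assumes runtime grows
    in proportion to the reduction in processors and that an estimate of
    the queue waiting time (est_wait) is available."""
    if free >= requested or free <= 0:
        return requested                      # no scale-down decision needed
    finish_if_scaled = est_runtime * requested / free   # starts now
    finish_if_waiting = est_wait + est_runtime           # starts after waiting
    return free if finish_if_scaled <= finish_if_waiting else requested

# Hypothetical example: 8 processors requested, 4 free, 10-minute estimate.
# Scaling down yields roughly 20 minutes of runtime, so it pays off only
# if the expected wait exceeds 10 minutes.
print(scale_down_or_wait(8, 4, 10, 15))   # -> 4 (run now on 4 processors)
print(scale_down_or_wait(8, 4, 10, 5))    # -> 8 (keep waiting)
```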
This section explores and evaluates the two improved moldable job allocation policies, which take advantage of the moldable property of parallel applications, on homogeneous parallel computers. The three allocation policies to be evaluated are described in detail in the following.
No adaptive scaling. This policy allocates to each parallel job exactly the number of processors it specifies. It is used in this section as the performance baseline for evaluating the moldable job allocation policies.

Adaptive scaling down. If a parallel job specifies a number of processors which