(the number varied during the overall execution
time). Given the number of parallel jobs used in
the Grid, the cluster performed better. However,
if the number of jobs on the Grid is increased,
the Grid can actually outperform the cluster. Note
that the Grid response time includes the sequential
run time needed to determine the number of tasks
and jobs and to compute the test for global
congruence, whose result is then used as input
data for the nz individual tests. This initial part
of the analysis also has to be executed
sequentially on the cluster. In addition, the Grid
response time also includes the job submission
overhead imposed by the gLite Workload
Management System. To avoid congestion
problems at the submission server, AxParafit.pl
submits only a limited number of jobs at any
given time. The actual processing time of the
2,048 AxParafit tests is therefore a better basis
for comparison with the cluster performance.
Another interesting observation is the average
processing time of 13.3 min per task on the Grid,
compared to the local execution time of 11 min on
the Grid client machines. This indicates that the
CPUs of distinct Computing Elements differ
considerably in speed and latency.
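
The throttled submission described above can be illustrated with a minimal sketch. It is not the logic of AxParafit.pl, whose internals are not shown here. The function name, the `max_in_flight` parameter, and the assumption that jobs finish in submission order are all hypothetical and chosen only to show the idea of capping the number of pending jobs.

```python
from collections import deque


def simulate_throttled_submission(job_ids, max_in_flight):
    """Simulate submitting jobs with at most `max_in_flight` pending at once.

    Returns a list of (event, job_id, in_flight_after_event) tuples.
    Assumption: the oldest pending job always finishes first (FIFO),
    which real Grid jobs do not guarantee.
    """
    if max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")
    waiting = deque(job_ids)
    in_flight = deque()
    events = []
    while waiting or in_flight:
        # Fill every free slot before waiting for a completion.
        while waiting and len(in_flight) < max_in_flight:
            job = waiting.popleft()
            in_flight.append(job)
            events.append(("submit", job, len(in_flight)))
        # Hypothetical completion: the oldest pending job finishes.
        done = in_flight.popleft()
        events.append(("done", done, len(in_flight)))
    return events


events = simulate_throttled_submission(range(6), max_in_flight=2)
peak = max(n for _, _, n in events)
submitted = [job for kind, job, _ in events if kind == "submit"]
```

In this toy run the number of pending jobs never exceeds the cap, at the cost of later jobs waiting for a free slot. That waiting is one reason the submission overhead shows up in the overall Grid response time.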
Overall, our Grid-based approach requires
computing times of the same order of magnitude
as those of a dedicated cluster. Consequently,
the gridified version provides an easier-to-use
alternative to a compute cluster with
comparable performance.
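
As a rough plausibility check on this order-of-magnitude claim, the following sketch computes an idealized lower bound on the parallel processing time. It uses the figures quoted in the text (2,048 tests, 13.3 min average per task, between 90 and 175 parallel Grid jobs) and assumes that each slot runs one task at a time, that every task takes exactly the average time, and that sequential setup and submission overhead are ignored. These assumptions are ours and the results are estimates, not measurements.

```python
import math


def ideal_parallel_minutes(n_tasks, n_slots, minutes_per_task):
    """Lower bound on wall-clock minutes under the stated assumptions.

    Tasks run in full "waves" of `n_slots` tasks, each wave taking
    `minutes_per_task`; the last wave may be only partly filled.
    """
    if n_slots < 1:
        raise ValueError("n_slots must be at least 1")
    waves = math.ceil(n_tasks / n_slots)
    return waves * minutes_per_task


N_TASKS = 2048        # number of AxParafit tests quoted in the text
GRID_MINUTES = 13.3   # average per-task time on the Grid quoted in the text

for slots in (90, 175):
    estimate = ideal_parallel_minutes(N_TASKS, slots, GRID_MINUTES)
    print(f"{slots} slots: about {estimate:.1f} min")

# Relative slowdown of an average Grid task versus the 11 min local run.
slowdown = GRID_MINUTES / 11.0
```

Under these assumptions, roughly doubling the number of parallel jobs roughly halves the estimated processing time, which is consistent with the remark that a larger number of Grid jobs could outperform the cluster.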
CONCLUSION
We have demonstrated how a compute-intensive
application for a statistical test of congruence
between host and parasite phylogenies can be
distributed efficiently on the Grid. The proposed
Grid-based implementation can greatly reduce
response times for large-scale analyses and makes
it feasible to compute a larger number of test
permutations, which in turn improves accuracy.
Moreover, we have integrated access to Grid
resources into an easy-to-use Graphical User
Interface (CopyCat), which entirely hides the
technical details related to the exploitation of Grid

Figure 5. Performance comparison of a 128-CPU cluster with a Grid using between 90 and 175 parallel
jobs. Note that the number of Grid jobs varied and was never constant.