[Figure 8.8: plot; horizontal axis: Number of orbitals]
FIGURE 8.8 Performance of various LAPACK eigenvalue solvers on multiple CPUs using
multithreaded BLAS for gold clusters with sizes ranging from 13 to 923 atoms (117-8307
orbitals). The solid line indicates results for the DSYGVX routine, while the dashed line
indicates results for the DSYGVR routine. Symbols indicate one processor (circles), two
processors (squares), four processors (triangles), and eight processors (diamonds).
required to link to a different library to take advantage of multithreading. Frequently,
multithreading is the default behavior and users benefit automatically. The choice of
algorithm is still important as mentioned above.
Examples of performance increases due to multithreading are shown in Figure
8.8. Again, the DSYGVR routine (and similarly DSYGVD) is much faster than the
DSYGVX routine. In Figure 8.8, we may also see how the routines scale on multiple
processors. The scaling for the DSYGVX algorithm is poor, achieving only a
30% speedup for the largest problem size when running on eight processors versus
running on one processor. On the other hand, the DSYGVR routine (and similarly
DSYGVD) shows much better scaling, achieving a speedup of 270% on four processors
and 300% when running on eight processors. The parallel efficiency (speedup
divided by the number of processors) of the DSYGVR routine is 90% for jobs with
two processors and 68% for jobs with four processors, but drops to 38% when using
eight processors. This result shows the practical limit for improving the performance
of solving the eigenvalue problem on the particular computer used to produce the
results. Different computer architectures will have different limits that should be
explored by the user.
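The parallel efficiency figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the quoted speedups of 270% and 300% mean speedup factors of 2.7 and 3.0 relative to one processor, a reading that is consistent with the quoted efficiencies but is not stated explicitly in the text.

```python
def parallel_efficiency(speedup, n_procs):
    """Parallel efficiency: speedup factor divided by the number of processors."""
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    return speedup / n_procs


def speedup_from_efficiency(efficiency, n_procs):
    """Inverse relation: speedup factor implied by a given efficiency."""
    return efficiency * n_procs


# Assumption: "270%" and "300%" are read as speedup factors of 2.7 and 3.0.
quoted_speedups = {4: 2.7, 8: 3.0}

efficiencies = {
    n: parallel_efficiency(s, n) for n, s in quoted_speedups.items()
}

for n, eff in efficiencies.items():
    print(f"{n} processors: efficiency = {eff:.1%}")

# The text quotes 90% efficiency on two processors; the implied speedup factor:
two_proc_speedup = speedup_from_efficiency(0.90, 2)
print(f"2 processors: implied speedup factor = {two_proc_speedup:.2f}")
```

Under that assumption the computed efficiencies are about 67.5% and 37.5%, which round to the 68% and 38% quoted for four and eight processors, and the 90% efficiency on two processors corresponds to a speedup factor of about 1.8.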
The above discussion focused solely on parallelism in the mathematical library.
For small to medium size problems, this approach is reasonable. However, as the
problem size increases, the expense of constructing the Hamiltonian and overlap
matrices increases as well and will demand an increasing fraction of the total compute
time. In this case, it is beneficial to parallelize construction of these matrices as