Chemistry Reference
In-Depth Information
FIGURE 8.7  Performance of various LAPACK eigenvalue solvers on a single CPU for gold
clusters with sizes ranging from 13 to 923 atoms (117-8307 orbitals). The solid line indicates
results for the DSYGVX routine, while the dashed line indicates results for the DSYGVR
routine. (Horizontal axis: number of orbitals.)
CPU core) with shared memory and controlled by a single operating system. Such
machines are quite common from laptops to supercomputers. Parallel machines
consist of more than one node, where a node contains one or more CPUs with its
own memory and operating system. (This is a rather simplified view of the diversity
of computing resources currently available, but it conveniently categorizes
the vast majority of the computers currently in use.) In this section, we will focus
primarily on SMP resources. Many of the results will apply equally well to larger
parallel machines. However, performance on these machines is highly dependent
on the technology used to connect the nodes to one another for sharing data. In
general, a high bandwidth, low latency interconnection is desirable. Further discussion
of these architectures and their performance is outside the scope of this
chapter.
We will consider two means of achieving parallel speedup on SMP architectures:
multithreading and message passing. The multithreading paradigm is applicable
only to SMP architectures, whereas the message passing paradigm can be applied on
SMP machines or on parallel machines.
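The difference between the two paradigms can be illustrated with a minimal sketch in Python. It is not taken from the chapter: the function names (block_bounds, dot_multithreaded, dot_message_passing) are hypothetical, and the dot product merely stands in for real numerical work. In the multithreaded version every thread reads the same arrays directly; in the message-passing version each worker sees only the data explicitly sent to it and returns its result as a message. The message-passing variant is emulated here with threads and queues inside one process, so it shows the programming model only. A production code would more likely rely on a threaded BLAS or an MPI library, and in standard CPython builds the global interpreter lock means this pure-Python sketch should not be expected to show a speedup.

```python
import queue
import threading


def block_bounds(n, parts):
    """Split range(n) into `parts` contiguous blocks of near-equal size."""
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def dot_multithreaded(x, y, workers=4):
    """Shared-memory style: every thread reads x and y directly."""
    partial = [0] * workers  # one slot per thread, so no lock is needed

    def work(rank, start, stop):
        partial[rank] = sum(x[i] * y[i] for i in range(start, stop))

    threads = [
        threading.Thread(target=work, args=(rank, start, stop))
        for rank, (start, stop) in enumerate(block_bounds(len(x), workers))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)


def dot_message_passing(x, y, workers=4):
    """Message-passing style: workers only see data that is sent to them."""
    inboxes = [queue.Queue() for _ in range(workers)]
    outbox = queue.Queue()

    def work(rank):
        xs, ys = inboxes[rank].get()  # receive a private copy of one block
        outbox.put(sum(a * b for a, b in zip(xs, ys)))  # send the result back

    threads = [threading.Thread(target=work, args=(rank,)) for rank in range(workers)]
    for t in threads:
        t.start()
    for rank, (start, stop) in enumerate(block_bounds(len(x), workers)):
        inboxes[rank].put((x[start:stop], y[start:stop]))  # explicit send of copies
    total = sum(outbox.get() for _ in range(workers))
    for t in threads:
        t.join()
    return total
```

Because the message-passing version never touches shared arrays, the same structure carries over to workers that live on different nodes, which is why that paradigm applies to both SMP and parallel machines, at the cost of explicitly copying data between workers.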
Modern, optimized implementations of the BLAS include the capability to use
more than one “thread” at a time, where a thread represents an independent compute
process, hence the term multithreading. Often the number of threads used is
equal to the number of processors on the machine to achieve optimal performance.
Optimized, multithreaded BLAS libraries typically yield very good performance
with a concomitant impact on the performance of the eigenvalue solver. This is
a particularly easy means to achieve parallel speedup, as usually the user is only