The coefficients $B_k$ ensure time reversibility to $O(\delta t^{K+3})$. The corrector associated with this method is

$$
C(t) \;=\; \omega \,\operatorname*{argmin}_{C} E_{\mathrm{KS}}(R;\,C) \;+\; (1-\omega)\, C^{p}(t),
\qquad (34)
$$

with $\omega = K/(2K-1)$. Variations of the above two methods are possible and have been discussed together with, for example, a noise dissipation algorithm [18-20] for additional stability.
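Concretely, the corrector of Eq. (34) is a linear mixing of the minimized and the predicted orbital coefficients. The sketch below assumes the minimizer output and the predictor output are already available as flat arrays; the function and variable names are illustrative and not taken from any particular AIMD code.

/* Corrector step of Eq. (34): blend the coefficients from minimizing
 * E_KS at fixed nuclear positions R with the time-reversible
 * predictor output, using omega = K/(2K - 1). */
#include <stddef.h>

void corrector_step(double *c,            /* out: corrected C(t)      */
                    const double *c_min,  /* argmin_C E_KS(R; C)      */
                    const double *c_pred, /* predictor output C^p(t)  */
                    size_t n,             /* number of coefficients   */
                    int K)                /* predictor order          */
{
    const double omega = (double)K / (2.0 * K - 1.0);
    for (size_t i = 0; i < n; ++i)
        c[i] = omega * c_min[i] + (1.0 - omega) * c_pred[i];
}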
3 Faster, Larger, and More Accurate: Recent Developments
Advanced applications in AIMD require the treatment of systems with hundreds of
atoms and an extended sampling of configurations. These simulations are only
possible using large computational resources and the most advanced algorithms.
Tapping the power of massively parallel computers was therefore an important goal
in many AIMD code projects. Further important improvements in AIMD simulations
came from better sampling algorithms and the emergence of improved electronic
structure methods.
3.1 Massively Parallel Implementation
Modern high-performance computer architectures are all based on massively parallel assemblies of multi-core CPUs connected by high-speed networks. Taking advantage of the available computing power requires constant revision and adaptation of algorithms and regular updates of computer codes. The AIMD community using plane-wave-based codes played a leading role in the efficient use of parallel computers. These codes are dominated by a small number of computational kernels, e.g., three-dimensional fast Fourier transforms (FFTs) and matrix multiplications, which can be mapped efficiently onto distributed-memory architectures. Early parallelization strategies focused either on parallelization over bands (orbitals) or on the 3d-FFT [21, 22]. The FFT route proved more successful, but mixed schemes were also explored to extend the range of optimal scaling. Efficient scaling to tens of thousands of processors was achieved with these implementations [23].
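As an illustration of how the FFT kernel maps onto distributed memory, the sketch below uses FFTW's MPI interface, which assigns each rank a slab of planes of the 3d grid and performs the required data transposes internally. The grid size and input data are placeholder assumptions; production plane-wave codes use their own, more elaborate decompositions.

/* Minimal distributed-memory 3d FFT using the FFTW MPI interface.
 * Each rank owns local_n0 contiguous planes of the n0 x n1 x n2 grid. */
#include <fftw3-mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    fftw_mpi_init();

    const ptrdiff_t n0 = 128, n1 = 128, n2 = 128;
    ptrdiff_t local_n0, local_0_start;

    /* Let FFTW choose the slab decomposition and local buffer size. */
    ptrdiff_t alloc = fftw_mpi_local_size_3d(n0, n1, n2, MPI_COMM_WORLD,
                                             &local_n0, &local_0_start);
    fftw_complex *data = fftw_alloc_complex(alloc);

    fftw_plan plan = fftw_mpi_plan_dft_3d(n0, n1, n2, data, data,
                                          MPI_COMM_WORLD,
                                          FFTW_FORWARD, FFTW_ESTIMATE);

    /* Fill the local slab (placeholder data). */
    for (ptrdiff_t i = 0; i < local_n0 * n1 * n2; ++i) {
        data[i][0] = 1.0;   /* real part      */
        data[i][1] = 0.0;   /* imaginary part */
    }

    fftw_execute(plan);     /* includes the internal MPI transposes */

    fftw_destroy_plan(plan);
    fftw_free(data);
    MPI_Finalize();
    return 0;
}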
The recent trend toward multi-core systems has led to compute nodes with many compute elements, prompting the adaptation of algorithms to multi-level parallelization [24-26]. Typically, a coarse-grain distributed-memory level based on the MPI library handles inter-node parallelization, while fine-grain, loop-level parallelization with OpenMP exploits the compute cores within each node. Several leading AIMD codes, e.g., CPMD [3, 22], Qbox [23], and Quantum ESPRESSO [26], follow this approach.
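A minimal sketch of this two-level scheme is given below, with MPI providing the coarse-grain distributed-memory level across nodes and an OpenMP parallel loop providing the fine-grain level within a node. The reduction is a toy stand-in for a real AIMD kernel; the program is illustrative rather than taken from any of the codes mentioned above.

/* Two-level (MPI + OpenMP) parallelization pattern: one MPI rank per
 * node (or socket), OpenMP threads across the cores within it. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided, rank, nranks;
    /* Request FUNNELED support: only the master thread calls MPI. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    const int n = 1 << 20;      /* size of this rank's slice of the data */
    double local = 0.0;

    /* Fine-grain, loop-level parallelism over the cores of a node. */
    #pragma omp parallel for reduction(+:local)
    for (int i = 0; i < n; ++i)
        local += 1.0 / (1.0 + (double)i + (double)rank * n);

    /* Coarse-grain, distributed-memory reduction across nodes. */
    double global = 0.0;
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("ranks=%d threads=%d sum=%.6f\n",
               nranks, omp_get_max_threads(), global);
    MPI_Finalize();
    return 0;
}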