have an idea about the computational complexity. Its execution required hours
and even days for processing.
Three key elements were used to reduce the processing time of the
application:
- The programming language
- Optimization of the source code
- Use of parallel algorithms and architectures
4.1 The Programming Language
During the early stages of the project, the algorithm was partially implemented
in other languages, such as R 4 and Python 5, to overcome the long execution
times obtained with Mathematica. Both are widely used in scientific computing
but, unfortunately, they are typically slower than compiled languages such as C
and Fortran. Although Fortran is an older language, its performance is still
among the best, and it is currently used in many HPC applications. In addition,
Fortran is well suited to distributed-memory parallel architectures and is
available in many Linux distributions through the GNU Fortran compiler. For
these reasons, GNU Fortran 4.1, which is based on Fortran 95, was used to
implement QDsim.
4.2 Optimization of the Source Code
The performance of a program is affected by the complexity of the algorithm
(i.e., the number of loops and computational operations in the code) and by the
location of data in the memory hierarchy. For these reasons, the number of
variables and operations should be kept small, and variables should be reused
where possible. On the one hand, this approach avoids memory paging 6, which
carries a high penalty in execution performance; on the other hand, it avoids
unnecessary operations. For example, (1) is composed of many factors which,
depending on the system parameters, sometimes result in a multiplication by
zero at runtime. Unfortunately, neither the compiler (in spite of the
optimization flags) nor the Control Unit 7 is able to find an efficient way of
minimizing such operations, so the program wastes valuable clock cycles on
them. Thus, the proposed algorithm skips those redundant operations, which
significantly increases its performance. From (1), the elements of the density
matrix are obtained and represented (in a compact manner) as a set of i terms
[3], where each term is composed of a set of a_j factors, as shown in (5).
\[
F(t, \rho_k) = \frac{d}{dt}\rho_k = \sum_{i} \Bigl( \prod_{j} a_{j,i} \Bigr) \cdot \rho_i . \qquad (5)
\]
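The idea of skipping products that contain a zero factor can be illustrated with a minimal sketch. It is written in Python for brevity and is not the Fortran code of QDsim. The function names `prune_terms` and `derivative` and the list-of-pairs layout are hypothetical. The sketch also assumes that the factors depend only on system parameters that are fixed before the integration starts, so that the pruning can be done once rather than at every time step.

```python
from math import prod  # available from Python 3.8


def prune_terms(terms):
    """Drop every term whose factor list contains an exact zero.

    `terms` is a list of (factors, i) pairs (hypothetical layout):
    `factors` holds the a_{j,i} values of one term of (5) and `i` is the
    index of the density-matrix element rho_i that the term multiplies.
    """
    return [(factors, i) for factors, i in terms
            if all(f != 0 for f in factors)]


def derivative(terms, rho):
    """Evaluate sum_i (prod_j a_{j,i}) * rho_i over the given terms."""
    return sum(prod(factors) * rho[i] for factors, i in terms)


# Illustrative values only; they do not come from the original work.
rho = [0.5, 0.25, 0.25]
terms = [
    ([2.0, 3.0], 0),
    ([0.0, 7.0, 1.5], 1),  # contains a zero factor, so the product vanishes
    ([4.0, 0.5], 2),
]

active = prune_terms(terms)      # done once, before the time loop
value = derivative(active, rho)  # done at every evaluation of F(t, rho_k)
```

Under the stated assumption, the pruned list gives the same value as the full list, while the second term is never multiplied out during the integration.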
4 See: The R Project for Statistical Computing, http://www.r-project.org
5 See: http://www.python.org
6 Reading from the lower levels of the memory hierarchy instead of doing so from the
higher ones.
7 Internal circuitry that allows the operations inside the CPU and the data flow.