b) A surprising amount of time was spent in the Fortran 95 intrinsic MAXVAL (for testing
convergence in subroutine checon).
c) The most time-consuming operation is the Fortran 95 intrinsic MATMUL, and on
the particular vector computer, it was running considerably slower than the peak
machine speed.
Program 5.6 addresses all of these issues. First, unless freedoms are “tied” together (a
device not used in this book), we can be sure that entries in g are not duplicated, and so the
scatter operation can be vectorised. A “compiler directive” (!dir$ ivdep in this case)
is therefore inserted before the loop elements_2a, enabling the loop to be vectorised.
Second, MAXVAL is replaced by its longhand equivalent. This is evidently a problem with
the particular vendor, whose implementation of MAXVAL could be much improved. Third,
and this is probably also a vendor problem, the MATMUL operation is changed from
matrix-vector to matrix-matrix by collecting all the p(g) vectors into a global matrix
g_pmul in the loop elements_2. Otherwise the program is the same as Program 5.5 and of
course produces the same results. Table 5.1 shows the progressive effect of each of these
changes to the coding of Program 5.5.
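The three changes can be illustrated with a small stand-alone sketch. It is not the listing of Program 5.6: the mesh data, the array sizes, the program name and the arrays g_g, g_utemp and u_ref are assumptions made for illustration, and a single km is assumed to be shared by all elements so that one matrix-matrix product can replace the element-by-element matrix-vector products. Here the directive is attached to an explicit inner loop over the entries of g, where the "no duplicates" guarantee holds, whereas the text places it before the loop elements_2a. A compiler that does not recognise !dir$ ivdep treats it as an ordinary comment.

```fortran
program vector_sketch
  ! Illustrative sketch only: data, sizes and several names are assumptions.
  implicit none
  integer, parameter :: iwp = selected_real_kind(15)
  integer, parameter :: ndof = 4, nels = 3, neq = 8
  integer   :: g(ndof), g_g(ndof, nels)     ! steering vector; one column per element
  real(iwp) :: km(ndof, ndof)               ! assumed identical for every element
  real(iwp) :: p(neq), u(neq), u_ref(neq)
  real(iwp) :: g_pmul(ndof, nels), g_utemp(ndof, nels), big
  integer   :: i, j, iel

  ! Made-up mesh: neighbouring elements share two freedoms, but the
  ! entries within any one column of g_g are all different.
  do iel = 1, nels
    do i = 1, ndof
      g_g(i, iel) = 2 * (iel - 1) + i
    end do
  end do
  do j = 1, ndof
    do i = 1, ndof
      km(i, j) = real(min(i, j), iwp)
    end do
  end do
  do i = 1, neq
    p(i) = real(i, iwp)
  end do

  ! Reference: one matrix-vector product per element.
  u_ref = 0.0_iwp
  do iel = 1, nels
    g = g_g(:, iel)
    u_ref(g) = u_ref(g) + matmul(km, p(g))
  end do

  ! Gather every p(g) into the columns of g_pmul, then do a single
  ! matrix-matrix product for all elements at once.
  elements_2: do iel = 1, nels
    g_pmul(:, iel) = p(g_g(:, iel))
  end do elements_2
  g_utemp = matmul(km, g_pmul)

  ! Scatter: within one element no freedom number is repeated, so the
  ! inner loop carries no dependency and the directive is a safe promise.
  u = 0.0_iwp
  elements_2a: do iel = 1, nels
!dir$ ivdep
    do i = 1, ndof
      u(g_g(i, iel)) = u(g_g(i, iel)) + g_utemp(i, iel)
    end do
  end do elements_2a

  ! Longhand equivalent of MAXVAL(ABS(u - u_ref)).
  big = 0.0_iwp
  do i = 1, neq
    if (abs(u(i) - u_ref(i)) > big) big = abs(u(i) - u_ref(i))
  end do

  write(*, '(a, es12.4)') 'longhand max difference  : ', big
  write(*, '(a, es12.4)') 'intrinsic max difference : ', maxval(abs(u - u_ref))
end program vector_sketch
```

Both printed values should agree, and both should be zero to rounding, which confirms that the matrix-matrix form reproduces the element-by-element result. If freedoms were "tied", so that a column of g_g contained a repeated entry, the directive would no longer be valid.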
Table 5.1 Timings of vectorised programs

Code version                     Time (seconds)
Original code (Program 5.5)      44.7
No dependency                    25.3
Replace MAXVAL                   21.6
Matrix-matrix (Program 5.6)      9.3
The speed-up of Program 5.6 over Program 5.5 on this particular vector computer
was by a factor of about 5, and illustrates the importance of code analysis when using
such machines.
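The quoted factor follows directly from Table 5.1: 44.7/9.3 is roughly 4.8. A minimal sketch of the arithmetic, using only the four timings in the table (the program name and the array t are hypothetical), prints the gain from each successive change alongside the cumulative speed-up:

```fortran
program speedup_sketch
  ! Timings in seconds, copied from Table 5.1 in the order listed there.
  implicit none
  integer, parameter :: iwp = selected_real_kind(15)
  real(iwp), parameter :: t(4) = (/ 44.7_iwp, 25.3_iwp, 21.6_iwp, 9.3_iwp /)
  integer :: i

  do i = 2, 4
    ! Ratio to the previous stage, then ratio to the original code.
    write(*, '(a, i1, a, f6.2, a, f6.2)') 'stage ', i, &
      ': step speed-up ', t(i - 1) / t(i), ', cumulative ', t(1) / t(i)
  end do
end program speedup_sketch
```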
Glossary of variable names used in Chapter 5
Scalar integers:
cg_iters          pcg iteration counter
cg_limit          pcg iteration ceiling
fixed_freedoms    number of fixed displacements
i                 simple counter
iel               simple counter
iflag             1 for “symmetry”, −1 for “antisymmetry”
iwp               SELECTED_REAL_KIND(15)
k                 simple counter
loaded_nodes      number of loaded nodes
lth               harmonic on which loads are to be applied
ndim              number of dimensions
ndof              number of degrees of freedom per element
nels              number of elements
neq               number of degrees of freedom in the mesh