The number of equations to be handled by each processor can then be calculated
using calc_neq_pp, and the freedom information generated using make_ggl. The distributed
equation arrays can then be allocated.
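calc_neq_pp is a library routine, but the idea behind such a distribution is easily
illustrated. The sketch below is hypothetical coding, not the library's: a simple block
distribution giving every processor neq/npes equations, with the first MOD(neq,npes)
processors taking one extra.

SUBROUTINE simple_calc_neq_pp(neq, npes, numpe, neq_pp)
  ! Illustrative only: block distribution of neq equations
  IMPLICIT NONE
  INTEGER, INTENT(IN)  :: neq     ! total number of equations
  INTEGER, INTENT(IN)  :: npes    ! number of processors
  INTEGER, INTENT(IN)  :: numpe   ! this processor's rank (1-based)
  INTEGER, INTENT(OUT) :: neq_pp  ! equations held on this processor
  neq_pp = neq / npes
  IF (numpe <= MOD(neq, npes)) neq_pp = neq_pp + 1
END SUBROUTINE simple_calc_neq_pp

For the example analysed later, 777,520 equations on 32 processors gives each
processor about 24,300 equations.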
The section commented "element stiffness integration etc" down to END DO gauss_
pts_1 is identical in the parallel and serial versions, but building the diagonal
preconditioner in parallel involves a scatter operation.
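The sketch below illustrates the assembly of the diagonal preconditioner, with array
names in the book's style (storkm_pp holding the element stiffness matrices and g_g_pp
the steering vectors) but with a hypothetical routine. For brevity it combines the
processors' partial sums over a full-length array using MPI_ALLREDUCE; the scatter
operation achieves the same assembly while communicating only the entries that cross
processor boundaries.

SUBROUTINE assemble_diag_precon(storkm_pp, g_g_pp, nels_pp, ndof,    &
                                neq, neq_pp, ieq_start, diag_precon_pp)
  USE mpi
  IMPLICIT NONE
  INTEGER, PARAMETER    :: iwp = SELECTED_REAL_KIND(15)
  INTEGER, INTENT(IN)   :: nels_pp, ndof, neq, neq_pp
  INTEGER, INTENT(IN)   :: ieq_start  ! first equation on this processor
  REAL(iwp), INTENT(IN) :: storkm_pp(ndof,ndof,nels_pp)
  INTEGER, INTENT(IN)   :: g_g_pp(ndof,nels_pp)
  REAL(iwp), INTENT(OUT):: diag_precon_pp(neq_pp)
  REAL(iwp) :: diag_tmp(neq), diag_all(neq)
  INTEGER   :: iel, i, ier
  diag_tmp = 0.0_iwp
  DO iel = 1, nels_pp                 ! sum local diagonal terms
    DO i = 1, ndof
      IF (g_g_pp(i,iel) /= 0) THEN    ! skip restrained freedoms
        diag_tmp(g_g_pp(i,iel)) = diag_tmp(g_g_pp(i,iel)) +          &
                                  storkm_pp(i,i,iel)
      END IF
    END DO
  END DO
  CALL MPI_ALLREDUCE(diag_tmp, diag_all, neq, MPI_DOUBLE_PRECISION,  &
                     MPI_SUM, MPI_COMM_WORLD, ier)
  diag_precon_pp = diag_all(ieq_start : ieq_start+neq_pp-1)
END SUBROUTINE assemble_diag_precon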
Information about the analysis is written on processor 1. The next section of coding
has to relocate the global loading entries in val to the appropriate processors using
reindex_fixed_nodes, print out the total load, and invert the preconditioner.
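The last two of these steps might be sketched as follows (again with hypothetical
routine and variable names): the total load is a global sum over the distributed load
vector, and "inverting" the diagonal preconditioner simply stores reciprocals so that
applying it inside the iteration becomes a multiplication.

SUBROUTINE report_load_and_invert(loads_pp, diag_precon_pp, neq_pp, numpe)
  USE mpi
  IMPLICIT NONE
  INTEGER, PARAMETER       :: iwp = SELECTED_REAL_KIND(15)
  INTEGER, INTENT(IN)      :: neq_pp, numpe
  REAL(iwp), INTENT(IN)    :: loads_pp(neq_pp)
  REAL(iwp), INTENT(INOUT) :: diag_precon_pp(neq_pp)
  REAL(iwp) :: part_sum, total_load
  INTEGER   :: ier
  part_sum = SUM(loads_pp)            ! this processor's contribution
  CALL MPI_ALLREDUCE(part_sum, total_load, 1, MPI_DOUBLE_PRECISION,  &
                     MPI_SUM, MPI_COMM_WORLD, ier)
  IF (numpe == 1) WRITE(*,'(A,E12.4)') "The total load is", total_load
  diag_precon_pp = 1.0_iwp / diag_precon_pp   ! store reciprocals
END SUBROUTINE report_load_and_invert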
The section commented "preconditioned cg iterations" is the parallel equivalent of the
similarly annotated section in Program 5.5, involving gather and scatter as described in
Section 12.2.8. The section commented "pcg equation solution" likewise mirrors the
serial version. Only the centreline vertical displacement of the elastic cuboid is
printed.
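For orientation, a minimal sketch of such a diagonally preconditioned CG loop on
distributed vectors follows. Here matvec_pp is a placeholder, not a library routine,
standing for the element-by-element product q = Ap built from the gather and scatter
of Section 12.2.8, and the simple residual-based convergence test is an assumption;
a test on the change in the solution, as in the serial program, would serve equally well.

SUBROUTINE pcg_solve_pp(neq_pp, tol, limit, diag_inv_pp, b_pp, x_pp, iters)
  USE mpi
  IMPLICIT NONE
  INTEGER, PARAMETER     :: iwp = SELECTED_REAL_KIND(15)
  INTEGER, INTENT(IN)    :: neq_pp, limit
  REAL(iwp), INTENT(IN)  :: tol, diag_inv_pp(neq_pp), b_pp(neq_pp)
  REAL(iwp), INTENT(OUT) :: x_pp(neq_pp)
  INTEGER, INTENT(OUT)   :: iters    ! limit+1 on failure to converge
  REAL(iwp) :: r_pp(neq_pp), p_pp(neq_pp), d_pp(neq_pp), q_pp(neq_pp)
  REAL(iwp) :: alpha, beta, rd_old, rd_new, rd0, part, denom
  INTEGER   :: ier
  x_pp = 0.0_iwp
  r_pp = b_pp                        ! r = b - A x, with x = 0
  d_pp = diag_inv_pp * r_pp          ! apply diagonal preconditioner
  p_pp = d_pp
  part = DOT_PRODUCT(r_pp, d_pp)     ! dot products need a global sum
  CALL MPI_ALLREDUCE(part, rd_old, 1, MPI_DOUBLE_PRECISION, MPI_SUM, &
                     MPI_COMM_WORLD, ier)
  rd0 = rd_old
  DO iters = 1, limit
    CALL matvec_pp(p_pp, q_pp)       ! q = A p by gather/scatter
    part = DOT_PRODUCT(p_pp, q_pp)
    CALL MPI_ALLREDUCE(part, denom, 1, MPI_DOUBLE_PRECISION,         &
                       MPI_SUM, MPI_COMM_WORLD, ier)
    alpha = rd_old / denom
    x_pp  = x_pp + alpha * p_pp
    r_pp  = r_pp - alpha * q_pp
    d_pp  = diag_inv_pp * r_pp
    part  = DOT_PRODUCT(r_pp, d_pp)
    CALL MPI_ALLREDUCE(part, rd_new, 1, MPI_DOUBLE_PRECISION,        &
                       MPI_SUM, MPI_COMM_WORLD, ier)
    IF (SQRT(rd_new/rd0) < tol) EXIT ! assumed convergence test
    beta   = rd_new / rd_old
    p_pp   = d_pp + beta * p_pp
    rd_old = rd_new
  END DO
END SUBROUTINE pcg_solve_pp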
Finally, the stress recovery section, involving the loop labelled gauss_pts_2, uses
exactly the same coding as the serial version, but the stresses are printed only for the
central surface element at the first Gauss point.
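The recovery at each Gauss point is simply sigma = D B u, as in the serial program;
a minimal sketch, with dee, bee and eld named in the serial program's style:

SUBROUTINE recover_stress(dee, bee, eld, sigma, nst, ndof)
  ! sigma = D B u at one Gauss point; for the 20-node bricks of
  ! this example nst = 6 stress components and ndof = 60
  IMPLICIT NONE
  INTEGER, PARAMETER     :: iwp = SELECTED_REAL_KIND(15)
  INTEGER, INTENT(IN)    :: nst, ndof
  REAL(iwp), INTENT(IN)  :: dee(nst,nst)   ! stress-strain matrix
  REAL(iwp), INTENT(IN)  :: bee(nst,ndof)  ! strain-displacement matrix
  REAL(iwp), INTENT(IN)  :: eld(ndof)      ! element displacements
  REAL(iwp), INTENT(OUT) :: sigma(nst)
  sigma = MATMUL(dee, MATMUL(bee, eld))
END SUBROUTINE recover_stress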
The example analysed is an elastic cube with a uniform pressure of unity on a square
patch at the centre of the cube. Data are listed in Figure 12.8 and results in Figure 12.9.
The vertical deflection is seen to be 0.03428 units and the vertical stress under the
load 0.999 (compared to 1.000 applied).
In all, the parallel program is about 50% longer than its serial counterpart. Two
salient aspects of performance are shown in Figures 12.10 and 12.12. The success of
iterative methods clearly depends on the number of iterations to convergence as a
proportion of problem size (Smith and Wang, 1998). Figure 12.10 shows how the iteration
count varies with problem size.
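For the run reported in Figure 12.9, convergence took 568 iterations for 777,520
equations, a ratio of about 0.07%.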
nels nxe nze nip
64000 40 40 8
aa bb cc e v
0.25 0.25 0.25 100.0 0.3
tol limit
1.0e-5 2000
Figure 12.8 Data for Program 12.1 example
This job ran on 32 processors
There are 270641 nodes 24161 restrained and 777520 equations
Time after setup is : 0.600000000000363798
The total load is -0.4000E+01
The number of iterations to convergence was 568
The central nodal displacement is : -0.3428E-01
The Centroid point stresses for element 1 are
Point 1
-0.7032E+00 -0.7032E+00 -0.9994E+00 0.3700E-03 0.3846E-03 0.3846E-03
This analysis took : 83.4400000000005093
Figure 12.9 Results from Program 12.1 example