Clearly it is necessary for the system to be able to translate the virtual address of
variables into a real address in memory. This translation usually involves a complicated
bit-pattern matching process called paging. The virtual store is split into segments or pages of fixed
or variable size referenced by page tables, and the supervisor program tries to "learn" from
the way in which the user accesses data in order to manage the store in a predictive way.
However, memory management can never be made entirely transparent to the user. It
must always be assumed that the programmer is acting in a reasonably logical manner,
accessing array elements in sequence (by rows or columns as organised by the compiler
and the language). If the user accesses a virtual memory of 10⁸ words in a random fashion,
the resulting paging requests will ensure that very little useful execution of the program can
take place (see e.g. Wille, 1995).
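The point about sequential access can be made concrete with a minimal sketch in C (not taken from the text; the routine names and the array size N are purely illustrative). Because C stores two-dimensional arrays row by row, the first routine below visits memory consecutively, while the second strides a whole row's length between accesses; in a column-major language such as Fortran the favourable order is reversed:

#include <stddef.h>

#define N 1024                      /* illustrative array dimension */

/* Visits elements in the order C lays them out in memory (row-major),
   so successive accesses fall within the same page or cache line.   */
double sum_row_order(double a[N][N])
{
    double s = 0.0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

/* Strides N doubles between successive accesses; on a paged virtual
   memory this touches a different page almost every time, so paging
   traffic rather than arithmetic dominates the run time.            */
double sum_column_order(double a[N][N])
{
    double s = 0.0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            s += a[i][j];
    return s;
}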
In the immediate future, "large" finite element analyses, say those involving more than 1 million
unknowns, are likely to be processed by the vector and parallel processing hardware
described in the next sections. When using such hardware there is usually a considerable
time penalty if the programmer interrupts the flow of the computation to perform out-
of-memory transfers, or if automatic paging occurs. Therefore, in Chapter 3 of this book,
special strategies are described whereby large analyses can still be processed "in-memory".
However, as problem sizes increase, there is always the risk that main memory, or fast subsidiary
memory ("cache"), will be exceeded, with a consequent deterioration of performance
on most machine architectures.
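To illustrate why exceeding the cache is costly, and the kind of reorganisation that avoids it, here is a minimal sketch (not from the text; the sizes N and BLOCK and the routine names are assumed for illustration). Both routines compute the same matrix product, but the second organises the work in small tiles chosen to stay resident in cache, so each operand is fetched from main memory far fewer times than in the straightforward triple loop:

#include <stddef.h>

#define N     1024        /* illustrative matrix dimension            */
#define BLOCK 64          /* tile size assumed small enough that      */
                          /* three BLOCK x BLOCK tiles fit in cache   */

/* Straightforward product c = a*b (c assumed zeroed by the caller).
   Once the matrices exceed cache, every pass of the k loop streams
   the operands in from main memory again and performance drops.     */
void matmul_naive(double a[N][N], double b[N][N], double c[N][N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            for (size_t k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
}

/* Blocked version of the same product: the three inner loops work on
   small tiles that remain in cache while they are being reused.      */
void matmul_blocked(double a[N][N], double b[N][N], double c[N][N])
{
    for (size_t ii = 0; ii < N; ii += BLOCK)
        for (size_t kk = 0; kk < N; kk += BLOCK)
            for (size_t jj = 0; jj < N; jj += BLOCK)
                for (size_t i = ii; i < ii + BLOCK; i++)
                    for (size_t k = kk; k < kk + BLOCK; k++)
                        for (size_t j = jj; j < jj + BLOCK; j++)
                            c[i][j] += a[i][k] * b[k][j];
}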
1.4 Vector processors
Early digital computers performed calculations "serially", that is, if a thousand operations
were to be carried out, the second could not be initiated until the first had been completed,
and so on. When operations are carried out on arrays of numbers, however, computations
in which the result of an operation on one pair of array elements has no effect on an
operation on another pair can, in principle, be carried out simultaneously. The hardware
feature by means of which this is realised in a computer is called a pipeline, and all modern
computers use this feature to a greater or lesser degree. Computers built around specialised
pipelining hardware are called vector computers. The "pipelines" are of limited length, so
for operations to be carried out simultaneously it must be arranged that the relevant operands
are actually in the pipeline at the right time. Furthermore, the condition that one operation
does not depend on the result of another must be respected. These two requirements (amongst
others) mean that some care must be taken in writing programs so that best use is made of
the vector processing capacity of many machines. An interesting side effect is that programs
well structured for vector machines tend to run better on any machine, because information
tends to be in the right place at the right time (e.g. in a special cache memory), and modern
so-called scalar computers tend to contain some vector-type hardware. In this book, beginning
in Chapter 5, programs which "vectorise" well will be illustrated.
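As a rough illustration (not from the text; the routine names are invented for the example), the two requirements can be seen in the following C loops. The first has no dependence between iterations, so its operands can be streamed through a pipeline or SIMD unit; the second is a recurrence in which each result needs the one just computed, so it cannot be vectorised as written:

#include <stddef.h>

/* Element-wise operation: c[i] depends only on a[i] and b[i], so the
   iterations are independent and a vectorising compiler can keep the
   pipeline (or SIMD lanes) full.                                      */
void axpy(size_t n, double alpha, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++)
        c[i] = alpha * a[i] + b[i];
}

/* Recurrence: each x[i] needs the x[i-1] computed on the previous
   iteration, so the operands cannot all be in the pipeline together
   and the loop executes essentially serially.                         */
void running_sum(size_t n, double *x)
{
    for (size_t i = 1; i < n; i++)
        x[i] += x[i - 1];
}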
True vector hardware tends to be expensive, and at the time of writing a much more
common way of increasing processing speed is to execute programs in parallel on many
processors. The motivation here is that the individual processors are then "standard" and