Clearly it is necessary for the system to be able to translate the virtual address of
variables into a real address in memory. This translation usually involves a complicated
bit-pattern matching process called paging. The virtual store is split into segments or pages of fixed
or variable size referenced by page tables, and the supervisor program tries to "learn" from
the way in which the user accesses data in order to manage the store in a predictive way.
However, memory management can never be made entirely transparent to the user. It
must always be assumed that the programmer is acting in a reasonably logical manner,
accessing array elements in sequence (by rows or columns as organised by the compiler
and the language). If the user accesses a virtual memory of 10⁸ words in a random fashion,
the resulting paging requests will ensure that very little useful execution of the program can
take place (see e.g. Wille, 1995).
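The point about sequential access can be made concrete with a minimal sketch in C (not taken from the text; the routine names and the array size N are purely illustrative). Because C stores two-dimensional arrays row by row, the first routine below visits memory consecutively, while the second strides a whole row's length between accesses; in a column-major language such as Fortran the favourable order is reversed:

#include <stddef.h>

#define N 1024                      /* illustrative array dimension */

/* Visits elements in the order C lays them out in memory (row-major),
   so successive accesses fall within the same page or cache line.   */
double sum_row_order(double a[N][N])
{
    double s = 0.0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

/* Strides N doubles between successive accesses; on a paged virtual
   memory this touches a different page almost every time, so paging
   traffic rather than arithmetic dominates the run time.            */
double sum_column_order(double a[N][N])
{
    double s = 0.0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            s += a[i][j];
    return s;
}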
In the immediate future, "large" finite element analyses, say those involving more than 1 million
unknowns, are likely to be processed by the vector and parallel processing hardware
described in the next sections. When using such hardware there is usually a considerable
time penalty if the programmer interrupts the flow of the computation to perform out-
of-memory transfers, or if automatic paging occurs. Therefore, in Chapter 3 of this book,
special strategies are described whereby large analyses can still be processed "in-memory".
However, as problem sizes increase, there is always the risk that main memory, or fast subsidiary
memory ("cache"), will be exceeded, with a consequent deterioration of performance
on most machine architectures.
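To illustrate why exceeding the cache is costly, and the kind of reorganisation that avoids it, here is a minimal sketch (not from the text; the sizes N and BLOCK and the routine names are assumed for illustration). Both routines compute the same matrix product, but the second organises the work in small tiles chosen to stay resident in cache, so each operand is fetched from main memory far fewer times than in the straightforward triple loop:

#include <stddef.h>

#define N     1024        /* illustrative matrix dimension            */
#define BLOCK 64          /* tile size assumed small enough that      */
                          /* three BLOCK x BLOCK tiles fit in cache   */

/* Straightforward product c = a*b (c assumed zeroed by the caller).
   Once the matrices exceed cache, every pass of the k loop streams
   the operands in from main memory again and performance drops.     */
void matmul_naive(double a[N][N], double b[N][N], double c[N][N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            for (size_t k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
}

/* Blocked version of the same product: the three inner loops work on
   small tiles that remain in cache while they are being reused.      */
void matmul_blocked(double a[N][N], double b[N][N], double c[N][N])
{
    for (size_t ii = 0; ii < N; ii += BLOCK)
        for (size_t kk = 0; kk < N; kk += BLOCK)
            for (size_t jj = 0; jj < N; jj += BLOCK)
                for (size_t i = ii; i < ii + BLOCK; i++)
                    for (size_t k = kk; k < kk + BLOCK; k++)
                        for (size_t j = jj; j < jj + BLOCK; j++)
                            c[i][j] += a[i][k] * b[k][j];
}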
1.4 Vector processors
Early digital computers performed calculations "serially", that is, if a thousand operations
were to be carried out, the second could not be initiated until the first had been completed,
and so on. When operations are carried out on arrays of numbers, however, computations
in which the result of an operation on one pair of array elements has no effect on an
operation on another pair can, in principle, be carried out simultaneously. The hardware
feature by means of which this is realised in a computer is called a pipeline, and all modern
computers use this feature to a greater or lesser degree. Computers built around specialised
pipelining hardware are called vector computers. The "pipelines" are of limited length, so
for operations to be carried out simultaneously it must be arranged that the relevant operands
are actually in the pipeline at the right time. Furthermore, the condition that one operation
does not depend on the result of another must be respected. These two requirements (amongst
others) mean that some care must be taken in writing programs so that best use is made of
the vector processing capacity of many machines. An interesting side effect is that programs
well structured for vector machines tend to run better on any machine, because information
tends to be in the right place at the right time (e.g. in a special cache memory), and modern
so-called scalar computers tend to contain some vector-type hardware. In this book, beginning
in Chapter 5, programs which "vectorise" well will be illustrated.
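As a rough illustration (not from the text; the routine names are invented for the example), the two requirements can be seen in the following C loops. The first has no dependence between iterations, so its operands can be streamed through a pipeline or SIMD unit; the second is a recurrence in which each result needs the one just computed, so it cannot be vectorised as written:

#include <stddef.h>

/* Element-wise operation: c[i] depends only on a[i] and b[i], so the
   iterations are independent and a vectorising compiler can keep the
   pipeline (or SIMD lanes) full.                                      */
void axpy(size_t n, double alpha, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++)
        c[i] = alpha * a[i] + b[i];
}

/* Recurrence: each x[i] needs the x[i-1] computed on the previous
   iteration, so the operands cannot all be in the pipeline together
   and the loop executes essentially serially.                         */
void running_sum(size_t n, double *x)
{
    for (size_t i = 1; i < n; i++)
        x[i] += x[i - 1];
}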
True vector hardware tends to be expensive, and at the time of writing a much more
common way of increasing processing speed is to execute programs in parallel on many
processors. The motivation here is that the individual processors are then "standard" and