carrying out the "real" computation, and minimize the time spent in overheads of
coordination, message exchanges, and waiting for intermediate results from other
processors. This problem is exacerbated by the fact that different parallel systems
have different architectures (vector versus scalar processors, shared memory versus
distributed memory, single core versus multi-core) and architectural parameters,
such as number of processors, amount of memory per processor, and network
bandwidth. Ideally, one should be able to write parallel programs that are
independent of such system-specific details.
This situation is analogous to the old days of programming a uniprocessor
computer system in machine or assembly language, where the programmer was
required to keep track of the contents of specific machine registers, perform
machine-specific operations on them, and move data between registers and main memory,
while maintaining a consistent view of the data and the program. The evolution
of high-level languages such as COBOL, FORTRAN, C, C++, and Java has given
a tremendous boost to the programmability of computer systems by providing an
abstract model of a computer system which exposes all its essential features that
are common across a variety of computer architectures and generations. As a re-
sult, the programmer can focus on only encoding the algorithm using a generic
high-level machine independent language, without worrying about the daunting
architectural details of the specific computer system on which the program is to be
run. The task of converting the high-level representation of an algorithm (in the
form of a program) to a low-level machine and architecture-specific form has been
relegated to the compilers for the machine. In addition to the ease of programming,
the high-level languages also allow users to run a program on computers of
different architectures and generations with little effort: to run the program
on a new system, one simply recompiles it using the system-specific compiler.
5.3.1 The Three P's of a Parallel Programming Model
The goal of a parallel programming model, in the same vein, is to expose the
essential elements of a parallel system in an abstract manner, making it easier for
a programmer to code a parallel algorithm into a high-level parallel program using
the primitives and abstractions provided by the model. Such a program can then be
translated into a low-level representation for execution on the parallel system. Such
a model must be intuitive, easy to understand, and must make the task of coding
the parallel algorithm into a high-level parallel program as simple as possible.
Moreover, it should hide the unnecessary details of the underlying parallel system,
so that a programmer can focus exclusively on the task of mapping a parallel
algorithm to the abstract parallel programming model, without having to worry
about the details of the specific system on which the program has to be run.
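As an illustration of this idea (not taken from the text), consider a sketch in Python using the standard-library `multiprocessing` module. The programmer expresses only the parallel structure of the algorithm, a map over independent data elements, while the runtime decides how many worker processes to create and how to schedule and communicate among them; the function and data names here are hypothetical.

```python
# A minimal sketch of a high-level parallel abstraction: the programmer
# writes a plain function and a parallel map; processor count, scheduling,
# and inter-process communication are hidden by the Pool abstraction.
from multiprocessing import Pool, cpu_count

def square(x):
    # The "real" computation performed on one data element.
    return x * x

if __name__ == "__main__":
    data = list(range(8))
    # cpu_count() adapts the worker count to the machine at hand,
    # so the same program runs unchanged on different systems.
    with Pool(processes=cpu_count()) as pool:
        results = pool.map(square, data)
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The same source runs unchanged on a laptop with two cores or a server with sixty-four; only the runtime's view of the machine differs, which is precisely the portability a good parallel programming model aims for.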
All such abstractions, however, come at a cost: there is an inherent trade-off
between the level of abstraction and program performance. Ideally, one should
not even have to bother writing a parallel program. One should just be able to write
a sequential program in a high-level language and a parallelizing compiler should
take care of the task of parallelizing the program. While this may be possible for
some specific applications running on specific systems, writing a general-purpose