at the $n$th discrete time step we start the iterations by setting $u^{n(0)}_{i,j,k} = u^{n-1}_{i,j,k}$, $i = i_l,\dots,i_r$, $j = j_l,\dots,j_r$, $k = k_l,\dots,k_r$, and in every iteration $l = 1,\dots$ and for every $i = i_l,\dots,i_r$, $j = j_l,\dots,j_r$, $k = k_l,\dots,k_r$ the following two-step procedure is used:
$$
Y = \bigl( a_{i,j,k}\,u^{n(l)}_{i-1,j,k} + a_{i,j,k}\,u^{n(l-1)}_{i+1,j,k} + a_{i,j,k}\,u^{n(l)}_{i,j-1,k} + a_{i,j,k}\,u^{n(l-1)}_{i,j+1,k} + a_{i,j,k}\,u^{n(l)}_{i,j,k-1} + a_{i,j,k}\,u^{n(l-1)}_{i,j,k+1} + b_{i,j,k} \bigr) / a_{i,j,k} \qquad (27)
$$

$$
u^{n(l)}_{i,j,k} = u^{n(l-1)}_{i,j,k} + \omega\,\bigl( Y - u^{n(l-1)}_{i,j,k} \bigr).
$$
We define the squared $L_2$-norm of the residuum after the $l$th SOR iteration by

$$
R(l) = \sum_{i,j,k} \bigl( a_{i,j,k}\,u^{n(l)}_{i,j,k} - a_{i,j,k}\,u^{n(l)}_{i-1,j,k} - a_{i,j,k}\,u^{n(l)}_{i+1,j,k} - a_{i,j,k}\,u^{n(l)}_{i,j-1,k} - a_{i,j,k}\,u^{n(l)}_{i,j+1,k} - a_{i,j,k}\,u^{n(l)}_{i,j,k-1} - a_{i,j,k}\,u^{n(l)}_{i,j,k+1} - b_{i,j,k} \bigr)^2.
$$
The iterative process is stopped if $R(l) < \mathrm{TOL}\; R(0)$. The relaxation parameter $\omega$
is chosen by the user to improve the convergence rate of the method.
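To make the two-step procedure and the stopping test concrete, a minimal serial sketch in C is given below. The array names a, b, u, the grid extents NI, NJ, NK, and the helper residuum() are illustrative assumptions and do not appear in the chapter; the coefficients and right-hand sides are assumed to have been assembled beforehand for the current time step, and, following the form of Eq. (27), all coefficients are kept in the single array a.

#define NI 66   /* grid extent in i (illustrative); interior points 1..NI-2 */
#define NJ 66
#define NK 66

static double a[NI][NJ][NK];   /* equation coefficients, assembled elsewhere */
static double b[NI][NJ][NK];   /* right-hand sides                           */
static double u[NI][NJ][NK];   /* solution; holds u^{n-1} on entry           */

/* Squared L2-norm of the residuum, cf. the definition of R(l) above. */
static double residuum(void)
{
    double R = 0.0;
    for (int i = 1; i < NI - 1; i++)
        for (int j = 1; j < NJ - 1; j++)
            for (int k = 1; k < NK - 1; k++) {
                double r = a[i][j][k] * u[i][j][k]
                         - a[i][j][k] * (u[i-1][j][k] + u[i+1][j][k]
                                       + u[i][j-1][k] + u[i][j+1][k]
                                       + u[i][j][k-1] + u[i][j][k+1])
                         - b[i][j][k];
                R += r * r;
            }
    return R;
}

/* One time step's SOR iterations, stopped when R(l) < TOL * R(0). */
void sor_solve(double omega, double TOL)
{
    double R0 = residuum();   /* R(0): residuum of the initial guess u^{n(0)} */
    double R  = R0;

    while (R >= TOL * R0) {
        for (int i = 1; i < NI - 1; i++)
            for (int j = 1; j < NJ - 1; j++)
                for (int k = 1; k < NK - 1; k++) {
                    /* Step 1: the value Y of Eq. (27).  Sweeping in
                       lexicographic order, the i-1, j-1, k-1 neighbours
                       already contain the new iterate (l), while the i+1,
                       j+1, k+1 neighbours still hold iterate (l-1), exactly
                       as in the formula.                                    */
                    double Y = (a[i][j][k] * (u[i-1][j][k] + u[i+1][j][k]
                                            + u[i][j-1][k] + u[i][j+1][k]
                                            + u[i][j][k-1] + u[i][j][k+1])
                              + b[i][j][k]) / a[i][j][k];

                    /* Step 2: the over-relaxed update with parameter omega. */
                    u[i][j][k] += omega * (Y - u[i][j][k]);
                }
        R = residuum();
    }
}

The index bounds 1..NI-2, 1..NJ-2, 1..NK-2 used here play the role of the ranges $i_l,\dots,i_r$, $j_l,\dots,j_r$, $k_l,\dots,k_r$ in the text.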
4. BUILDING UP THE PARALLEL ALGORITHM
4.1. MPI Programming
A parallel computer architecture (cf. [67]) is usually categorized by two aspects: whether the memory is physically centralized or distributed, and whether or not the address space is shared. On the one hand, there is the so-called SMP (symmetric multi-processor) architecture, which uses shared system resources, e.g., memory and an input/output subsystem, equally accessible from all processors. On the other hand, there is the MPP (massively parallel processors) architecture, where the so-called nodes are connected by a high-speed network. Each node has its own processor, memory, and input/output subsystem, and an operating system runs on each node. "Massively" does not necessarily mean a large number of nodes, so one can consider, e.g., a cluster of Linux computers of reasonable size (and price) to solve a particular scientific or engineering problem. But, of course, parallel computers with a huge number (hundreds) of nodes are used at large computer centers.
The main goal of parallel programming is to utilize all available processors and minimize the elapsed time of the program. In the SMP architecture one can assign the parallelization job to the compiler, which usually parallelizes some DO loops (for image processing applications based on such an approach we refer the reader to, e.g., [68]). This is a simple approach, but it is restricted to having such memory