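\[
u^{\ell+1}_{i,j,k} = \alpha\, u^{\ell}_{i,j,k} + \beta \left( u^{\ell}_{i-1,j,k} + u^{\ell}_{i,j-1,k} + u^{\ell}_{i,j,k-1} + u^{\ell}_{i+1,j,k} + u^{\ell}_{i,j+1,k} + u^{\ell}_{i,j,k+1} \right) \tag{10.3}
\]
for $1 \le i, j, k \le n-1$, where $\alpha = 1 - 6\Delta t/h^2$ and $\beta = \Delta t/h^2$.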
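One time step of scheme (10.3) can be sketched with NumPy array slicing. This is a minimal illustration, not the book's implementation: the function name `explicit_step`, the array layout (shape `(n+1, n+1, n+1)` with indices running from 0 to n), and the choice to leave boundary values untouched are all assumptions made here.

```python
import numpy as np

def explicit_step(u, dt, h):
    """Return the solution after one time step of scheme (10.3).

    Assumptions (not from the text): u has shape (n+1, n+1, n+1),
    and boundary values are simply copied over unchanged.
    """
    alpha = 1.0 - 6.0 * dt / h**2
    beta = dt / h**2
    c = slice(1, -1)          # inner points 1 .. n-1 along one axis
    u_new = u.copy()
    # Two multiplications and six additions per inner grid point.
    u_new[c, c, c] = alpha * u[c, c, c] + beta * (
        u[:-2, c, c] + u[c, :-2, c] + u[c, c, :-2]
        + u[2:, c, c] + u[c, 2:, c] + u[c, c, 2:]
    )
    return u_new
```

Each shifted slice, such as `u[:-2, c, c]` for the $i-1$ neighbor, has the same shape as the block of inner points, so the whole update is expressed without explicit loops.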
The above formula requires eight floating-point operations per inner grid point:
two multiplications and six additions. That is, the total number of floating-point
operations per time step is $8(n-1)^3$. Recall from Sect. 7.4.5 that explicit numerical
schemes often have a strict restriction on the maximum time step size, which is
\[
\Delta t \le \frac{1}{6} h^2
\]
for this particular 3D case. The minimum number of time steps $N$ needed for solving
(10.1) between $t = 0$ and $t = 1$ is consequently $N = 6/h^2 = 6n^2$, since $h = 1/n$. Therefore, the
total number of floating-point operations for the entire computation is
\[
6n^2 \times 8(n-1)^3 = 48 n^2 (n-1)^3 \approx 48 n^5 .
\]
If we have $n = 1{,}000$, then the entire computation requires $48 \times 10^{15}$ floating-point
operations. How much CPU time does it need to carry out these operations
on a serial computer? Let us assume that an extremely fast serial computer has a
peak performance of 48 GFLOPS, i.e., $48 \times 10^{9}$ FLOPS; then the total computation
will require $10^{6}$ s, i.e., roughly 278 h. This may not sound like an alarmingly long
time. However, the sustainable performance of numerical schemes of type (10.3),
which are computer-memory intensive, is normally far below the theoretical peak
performance. This is due to the increasing gap between the processor speed and
memory speed on modern microprocessors, commonly referred to as the “memory
wall” problem [21]. Moreover, our simple model equation (10.1) has not considered
variable coefficients, difficult boundary conditions, or source terms. Therefore, it
is fair to say that a realistic 3D diffusion problem can require a lot more than the
above theoretical CPU usage, making a serial computer totally unfit for the explicit
scheme to work on a $1{,}000 \times 1{,}000 \times 1{,}000$ mesh.
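The arithmetic behind this estimate is easy to reproduce. The short sketch below is an illustration only; the function name `explicit_cost` and its default peak rate of 48 GFLOPS are assumptions taken from the example in the text, and the result is a lower bound because it assumes the machine runs at peak performance throughout.

```python
def explicit_cost(n, peak_flops=48e9):
    """Estimate the cost of the explicit scheme on an n x n x n mesh.

    Returns (time steps, total floating-point operations, seconds at peak).
    """
    steps = 6 * n**2                  # N = 6 / h^2 with h = 1/n
    flops_per_step = 8 * (n - 1)**3   # 8 operations per inner grid point
    total_flops = steps * flops_per_step
    seconds = total_flops / peak_flops
    return steps, total_flops, seconds

steps, total_flops, seconds = explicit_cost(1000)
print(f"time steps:      {steps:.3e}")
print(f"total flops:     {total_flops:.3e}")
print(f"hours at peak:   {seconds / 3600:.1f}")
```

Because the exact count uses $(n-1)^3$ rather than $n^3$, the computed totals come out slightly below the rounded $48n^5$ figures quoted above.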
As another consideration, numerical simulators are frequently used as an experimental
tool. Many different runs of the same simulator are typically needed,
which requires the computing time of each simulation to be within, e.g., an hour, or ideally
minutes.
It should be mentioned that there exist more computationally efficient methods
for solving (10.1) than the above explicit scheme. For example, a numerical method
with no stability constraint can use dramatically fewer time steps, but with much
more work per step. Nevertheless, the above simple example suffices to show that
serial computers clearly have a limit in computing speed. The bad news is that the
speed of a single CPU core is not expected to grow anymore in the future. Also, as
 