Since some a_j can be either functions (composed of hyperbolic and exponential
operations) or constants, the approach is to compute first those a_j with lower
CPI⁸ and then compute the functions with higher CPIs (for example, math
operations such as “×”, “+”, “-” and “÷” have a lower CPI than “sinh(x)” and
“e^x”). When a_j = 0, the algorithm skips the product operator and computes
the next i-th term, as shown in Fig. 4.
Fig. 4. Flow diagram of the optimization of the number of operations
This method avoids unnecessary operations whenever a multiplication by zero
would occur, so fewer valuable clock periods are wasted.
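The skipping and ordering idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name sum_of_products, the representation of function-valued coefficients as zero-argument callables, and the sample data are all assumptions made for illustration. The sketch handles the cheap constant coefficients first, defers the expensive sinh and exp coefficients to a second pass, and skips the product whenever a coefficient is zero.

```python
import math


def sum_of_products(coeffs, values):
    """Accumulate a_j * v_j, skipping every product whose a_j is zero.

    coeffs: numeric constants, or zero-argument callables standing in for
            expensive function-valued coefficients (hypothetical encoding).
    Returns the sum and the number of multiplications actually performed.
    """
    total = 0.0
    n_mult = 0
    deferred = []  # indices of expensive, function-valued coefficients

    # Pass 1: constant coefficients (cheap operations).
    for j, a in enumerate(coeffs):
        if callable(a):
            deferred.append(j)
            continue
        if a == 0:
            continue  # multiplication by zero: skip the product
        total += a * values[j]
        n_mult += 1

    # Pass 2: function-valued coefficients (expensive operations).
    for j in deferred:
        a = coeffs[j]()
        if a == 0:
            continue
        total += a * values[j]
        n_mult += 1

    return total, n_mult


x = 0.5
coeffs = [2.0, 0.0, lambda: math.sinh(x), lambda: math.exp(x), 0.0]
values = [1.0, 3.0, 2.0, 4.0, 5.0]
total, n_mult = sum_of_products(coeffs, values)
```

With this sample data, two of the five coefficients are zero, so only three multiplications are carried out.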
4.3 Use of Parallel Algorithms and Architectures
To reduce the processing time of complex computations, parallel techniques can
be applied. A computational problem can be parallelized either with compiler
directives [6] or manually. The former approach is not recommended for this
problem because it yields lower performance on complex problems: the
parallelization process depends on many factors, such as the algorithm
structure⁹, the parallel computer architecture¹⁰ and the parallel programming
model. Since there is no general method for parallelization, the series of
steps described in [5] was followed.
Based on the flow diagram of Fig. 3, there are two types of partition schemes:
Loop Partitioning (LP), applied to the VX loop, and Task Partitioning (TP),
in which the tasks (see Fig. 3) are parallelized. As mentioned previously, these
8 Acronym for Clocks Per Instruction.
9 Regarding the data and task dependencies in a code.
10 Architectures based on distributed memory, shared memory, number of processing
units and interconnection topology.
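The difference between the two partition schemes named in this section can be sketched in Python. This is only an illustration under stated assumptions: the VX loop and the tasks of Fig. 3 are not reproduced here, so the loop body and the tasks below are placeholders, and the helper names loop_partition and task_partition are hypothetical. Threads are used only to show how the work is divided; in CPython they do not speed up CPU-bound code.

```python
from concurrent.futures import ThreadPoolExecutor


def loop_partition(body, n_iters, n_workers):
    """LP: split one loop's index range into contiguous chunks, one per worker."""
    bounds = [
        (w * n_iters // n_workers, (w + 1) * n_iters // n_workers)
        for w in range(n_workers)
    ]

    def run_chunk(lo_hi):
        lo, hi = lo_hi
        return [body(i) for i in range(lo, hi)]

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        chunks = list(pool.map(run_chunk, bounds))  # map preserves chunk order
    return [y for chunk in chunks for y in chunk]


def task_partition(tasks):
    """TP: run independent tasks concurrently, one worker per task."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [f.result() for f in futures]


# Placeholder loop body and tasks, standing in for the real computation.
squares = loop_partition(lambda i: i * i, n_iters=10, n_workers=3)
task_results = task_partition(
    [lambda: sum(range(100)), lambda: max(3, 7), lambda: "done"]
)
```

In the loop-partitioned case every worker runs the same body on a different index range, whereas in the task-partitioned case each worker runs a different piece of code. Both helpers assume that iterations and tasks are independent of one another.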
 