Information Technology Reference
In-Depth Information
language. Instead, interpreted languages often offer alternative programming tech-
niques for achieving efficiency in computations with large data structures or long
loops.
Basic Ideas of Vectorization
Algorithms with long for loops in Matlab and Python should be
vectorized
. Vec-
torization means expressing mathematical operations in terms of vector operations
instead of loops with element-by-element computations. A simple example may be
summing up the square of the elements of a vector
.
v
1
;:::;
v
N
/
:
s D 0
for
i D 0;:::;N
s s C
v
i
end for
In Matlab this translates to
s=0;
for i = 1:N
s = s + v(i)
*
v(i);
end
Similar Python code reads
s=0
for i in range(N):
s = s + v[i]
*
v[i]
These loops normally run slowly. To vectorize the code, we need to express the
loops in terms of vector operations. This is very simple in the present case, because
s
is nothing but the square of the norm of
v
.
Matlab has a function
norm
that we can use:
s = norm(v); s = s
*
s
Python has a special package, Numerical Python, for efficient array storage and
computation. This module does not provide a norm function, but the norm of
v
is
the square root of the inner product (also called the scalar product) of
v
and itself.
Since Numerical Python offers an
innerproduct
function, we can compute
s
as
s = innerproduct(v,v)
The major difference between computing
s
through a for loop and through a
norm
or
innerproduct
function is that the latter functions are implemented as a for loop
in C or Fortran
. Therefore,
norm
in Matlab and
innerproduct
in Python run as fast
as highly optimized C or Fortran code.
The idea of vectorization is to avoid explicit loops and instead invoke C or Fortran
functions where the loop can be much more efficiently implemented. However,
vectorizing an algorithm can require quite some rewriting of a loop-based algo-
rithm. The rewriting also depends on what vector operations are offered by the