Digital Signal Processing Reference
In-Depth Information
2.2
Public-Key Algorithms and Implementations
2.2.1
Efficient Modular Multiplication
Public-key cryptography (PKC), a concept introduced by Diffie and Hellman [ 18 ]
in the mid 1970s, has gained its popularity together with the rapid evolution of
today's digital communication systems. The best-known public-key cryptosystems
are based on factoring i.e. RSA [ 66 ] and on the discrete logarithm problem
in a large prime field (Diffie-Hellman, ElGamal, Schnorr, DSA) [ 53 ] oronan
elliptic/hyperelliptic curve (ECC/HECC) [ 42 , 43 , 56 ] . Based on the hardness of the
underlying mathematical problem, PKC usually deals with large numbers ranging
from a few hundreds to a few thousands of bits in size. Consequently, the efficient
implementations of the PKC primitives has always been a challenge.
An efficient implementation of the mentioned cryptosystems highly depends on
the efficient implementation of modular arithmetic. Namely, modular multiplication
forms the basis of modular exponentiation which is the core operation of the RSA
cryptosystem. It is also present in many other cryptographic algorithms including
those based on ECC and HECC. In particular, if one uses projective coordinates for
ECC/HECC, modular multiplication remains the most time consuming operation
for ECC. Hence, an efficient implementation of PKC relies on efficient modular
multiplication.
Two algorithms for modular multiplication, namely Barrett [ 8 ] and Montgomery
[ 57 ] algorithms are widely used today. Both algorithms avoid multiple-precision
divisions, the operation that is considered to be expensive, especially in hardware.
Implemented on a first generation of DSP produced by Texas Instruments (TMS
32010), the Barrett's algorithm is one of the famous examples in the area of signal
processing for cryptography. The properties of a DSP such as fast multiply and
accumulate (MAC) operation and a fast microprocessor on a single chip seemed to
be an ideal combination for the proposed algorithm. Furthermore, additional speed-
ups were obtained by taking advantage of a feature of the TMS320 DSP which
allowed auto increment and decrement of data pointers during MAC operations.
This ensured the data fetching for free. The original Barrett reduction algorithms is
given in Algorithm 1 .
Algorithm 1 Barrett modular reduction
Input: A
μ = 2 2 n
M .
=(
A 2 n 1
...
A 0
)
2 , M
=(
M n 1
...
M 0
)
2 where M n 1
=
0,
/
Output: R
=
A mod M .
A
2 n
1 μ
2 n + 1
Q
A mod 2 n + 1
QM mod 2 n + 1
R
while R
M do
R
R
M
end while
return R .
 
Search WWH ::




Custom Search