Signal Processing for Cryptography and Security Applications - Signal Processing Systems

Digital Signal Processing Reference

In-Depth Information

2.2

Public-Key Algorithms and Implementations

2.2.1

Efficient Modular Multiplication

Public-key cryptography (PKC), a concept introduced by Diffie and Hellman [ 18 ]

in the mid 1970s, has gained its popularity together with the rapid evolution of

today's digital communication systems. The best-known public-key cryptosystems

are based on factoring i.e. RSA [ 66 ] and on the discrete logarithm problem

in a large prime field (Diffie-Hellman, ElGamal, Schnorr, DSA) [ 53 ] oronan

elliptic/hyperelliptic curve (ECC/HECC) [ 42 , 43 , 56 ] . Based on the hardness of the

underlying mathematical problem, PKC usually deals with large numbers ranging

from a few hundreds to a few thousands of bits in size. Consequently, the efficient

implementations of the PKC primitives has always been a challenge.

An efficient implementation of the mentioned cryptosystems highly depends on

the efficient implementation of modular arithmetic. Namely, modular multiplication

forms the basis of modular exponentiation which is the core operation of the RSA

cryptosystem. It is also present in many other cryptographic algorithms including

those based on ECC and HECC. In particular, if one uses projective coordinates for

ECC/HECC, modular multiplication remains the most time consuming operation

for ECC. Hence, an efficient implementation of PKC relies on efficient modular

multiplication.

Two algorithms for modular multiplication, namely Barrett [ 8 ] and Montgomery

[ 57 ] algorithms are widely used today. Both algorithms avoid multiple-precision

divisions, the operation that is considered to be expensive, especially in hardware.

Implemented on a first generation of DSP produced by Texas Instruments (TMS

32010), the Barrett's algorithm is one of the famous examples in the area of signal

processing for cryptography. The properties of a DSP such as fast multiply and

accumulate (MAC) operation and a fast microprocessor on a single chip seemed to

be an ideal combination for the proposed algorithm. Furthermore, additional speed-

ups were obtained by taking advantage of a feature of the TMS320 DSP which

allowed auto increment and decrement of data pointers during MAC operations.

This ensured the data fetching for free. The original Barrett reduction algorithms is

given in Algorithm 1 .

Algorithm 1 Barrett modular reduction

Input: A

μ = 2 2 n

M .

=(

A 2 n − 1

...

A 0

)

2 , M

=(

M n − 1

...

M 0

)

2 where M n − 1

=

0,

/

Output: R

=

A mod M .

A

2 n

1 μ

2 n + 1

Q

−

⇐

A mod 2 n + 1

QM mod 2 n + 1

R

⇐

−

while R

≥

M do

R

⇐

R

−

M

end while

return R .

Signal Processing Systems

Search WWH ::

Custom Search

Home