Cryptography Reference
In-Depth Information
generates only one bit of the result at each step. Its implementation in software is usu-
ally very slow due to many vector shifts. The large number of these operations renders
the implementation unfeasible, especially on small 8-bit processors where shifts can
only be performed one bit at a time. The shift-and-add method is more suitable for
hardware, where a vector shift can be performed in one clock cycle. Embedded proces-
sors require faster methods for field multiplication. The following sections present a
description of an optimized hierarchical technique for binary polynomial multiplica-
tion. This method combines different multiplication techniques and achieves superior
results on constrained WSN platforms.
9.8.2.2 Optimized Hierarchical Method for Polynomial Multiplication
The optimized hierarchical technique for binary polynomial multiplication consists
of two stages: the first stage is the application of the Karatsuba-Ofman technique,
which divides large polynomials into several smaller polynomials (Knuth 1997); the
second stage performs the actual multiplication of binary polynomials using the Comb
method (Lopez and Dahab 2000).
For embedded processors with small word sizes, the hierarchical multiplication
technique is attractive for large fields. In the case of smaller values of m , the second stage
of the technique can be applied directly in order to avoid unnecessary overhead. The
execution time of the whole method depends on the efficiency of the Comb multiplica-
tion. This algorithm should be implemented in assembly language to provide maxi-
mum performance on a given hardware platform. The multiplication of two binary
polynomials a ( z ) and b ( z ) of degree m- 1 results in a polynomial of degree at most 2 m- 2.
In the first step, the Karatsuba-Ofman method divides large m- 1 polynomials into
several polynomials with fewer coefficients. This divide-and-conquer method can be
directly adapted for binary polynomials. In the case of a division into two parts (depth
d = 2), the polynomial multiplication c ( z ) = a ( z ) b ( z ) can be represented as
n
=
+
az
()
A zx
()
A z
()
(9.1)
1
2
n
=
+
bz
()
B zz
()
B z
()
(9.2)
1
2
2
n
n
=
+ +
+ + +
+
cz
( )
ABz
((
A
A
)(
B
B
)
AB
AB z
)
AB
(9.3)
11
1
2
1
2
2 2
11
2 2
é ù
ê ê ú
n and A 1 ,A 2 ,B 1 ,B 2 are binary polynomials of degree less than n .
For d = 2, the above technique substitutes one full multiplication with three half-
size multiplications. One multiplication is saved at the cost of several extra additions
that are very fast in binary fields. The most time-consuming part of Eq. (9.3) is the
computation of the subproducts A 1 B 1 ,A 2 B 2 and ( A 1 + A 2 )( B 1 + B 2 ). In the second stage
of the hierarchical multiplication, these products are calculated using the optimized
Comb method with a window width of w -bits.
where
2
Search WWH ::




Custom Search