Cryptography Reference
In-Depth Information
4.2 Implementation of Multiplier, Partial Inverter and LSEs Solver
Our multipliers and partial inverter can execute a multiplication and partial
multiplicative inversion over
GF
(2
8
) within one clock cycle respectively. As
mentioned in Section 3.5, the critical path of each iteration of optimized Gauss-
Jordan elimination includes two multiplications and one addition. Since there ex-
ist some overlaps in two serial multiplications, one iteration of optimized Gauss-
Jordan elimination can be computed in 20
ns
with one clock cycle. Therefore, it
takes 12 clock cycles to solve a system of linear equations of matrix size 12
×
12,
which is 240
ns
with a frequency of 50 MHz.
Tabl e 5.
FPGA Implementations of the Multiplier, Partial Inverter and Optimized
Gauss-Jordan Elimination over
GF
(2
8
)
Components
Multiplier Partial inverter Gauss-Jordan elimination
Combinational
ALUTs
37
22
21718
Dedicated logic registers
0
0
1644
Clock cycles
1
1
12
Running time (
ns
)
10.768
9.701
240
Table 5 is extracted after place and route of multiplication, partial multi-
plicative inversion and optimized Gauss-Jordan elimination over
GF
(2
8
). Three
different kinds of cells included in our proposed architecture have been described
and their resource consumptions are given in Table 6.
Tabl e 6.
The Resource Consumptions for Each Cell in the Proposed Architecture for
Solving System of Linear Equations
Cell
Use
Two-input multiplier Three-input multiplier Adder
I
cell Partial inversion
0
2
0
N
cell
Normalization
1
1
0
E
cell
Elimination
0
2
1
4.3 Implementation of Transformations and Polynomial Evaluations
Theanetransformations
L
1
−
1
and
L
2
−
1
invoke vector addition and matrix-
vector multiplication over
GF
(2
8
). Table 7 shows that two ane transformations
take 18 clock cycles, which is 360
ns
with a frequency of 50 MHz, where the sec-
ond and fourth columns are the performance of vector additions using
L
1
offset
and
L
2
offset respectively and the third and fifth columns are the performance of
matrix-vector multiplications using the matrixes of
L
1
−
1
and
L
2
−
1
respectively.
Table 8 illustrates that polynomial evaluations takes 156 clock cycles, which is
3120
ns
with a frequency of 50 MHz, where the second, third and fourth columns
are the performances of components of multivariate polynomials, respectively.