High-Speed Hardware Implementation of Rainbow Signature on FPGAs - Post-Quantum Cryptography

Cryptography Reference

In-Depth Information

4.2 Implementation of Multiplier, Partial Inverter and LSEs Solver

Our multipliers and partial inverter can execute a multiplication and partial

multiplicative inversion over GF (2 8 ) within one clock cycle respectively. As

mentioned in Section 3.5, the critical path of each iteration of optimized Gauss-

Jordan elimination includes two multiplications and one addition. Since there ex-

ist some overlaps in two serial multiplications, one iteration of optimized Gauss-

Jordan elimination can be computed in 20 ns with one clock cycle. Therefore, it

takes 12 clock cycles to solve a system of linear equations of matrix size 12

×

12,

which is 240 ns with a frequency of 50 MHz.

Tabl e 5. FPGA Implementations of the Multiplier, Partial Inverter and Optimized

Gauss-Jordan Elimination over GF (2 8 )

Components

Multiplier Partial inverter Gauss-Jordan elimination

Combinational ALUTs

37

22

21718

Dedicated logic registers

0

1644

Clock cycles

1

12

Running time ( ns )

10.768

9.701

240

Table 5 is extracted after place and route of multiplication, partial multi-

plicative inversion and optimized Gauss-Jordan elimination over GF (2 8 ). Three

different kinds of cells included in our proposed architecture have been described

and their resource consumptions are given in Table 6.

Tabl e 6. The Resource Consumptions for Each Cell in the Proposed Architecture for

Solving System of Linear Equations

Cell

Use

Two-input multiplier Three-input multiplier Adder

I cell Partial inversion

0

2

0

N cell

Normalization

1

0

E cell

Elimination

0

2

1

4.3 Implementation of Transformations and Polynomial Evaluations

Theanetransformations L 1 − 1 and L 2 − 1 invoke vector addition and matrix-

vector multiplication over GF (2 8 ). Table 7 shows that two ane transformations

take 18 clock cycles, which is 360 ns with a frequency of 50 MHz, where the sec-

ond and fourth columns are the performance of vector additions using L 1 offset

and L 2 offset respectively and the third and fifth columns are the performance of

matrix-vector multiplications using the matrixes of L 1 − 1 and L 2 − 1 respectively.

Table 8 illustrates that polynomial evaluations takes 156 clock cycles, which is

3120 ns with a frequency of 50 MHz, where the second, third and fourth columns

are the performances of components of multivariate polynomials, respectively.

Post-Quantum Cryptography

Search WWH ::

Custom Search

Home