solutions, thus increasing the overall number of clock cycles required to complete
the encryption process. The smallest architectures proposed in the literature are based on
eight-bit data paths and contain only one S-Box. Very compact solutions for ASIC
implementations are proposed in [137, 138, 174], while solutions targeting FPGA
implementations are proposed in [135, 166].
Scalable solutions with varying data path widths are proposed in [267, 357]. In
all these architectures, the designer can trade off performance for design cost by
selecting the number of parallel S-Boxes that are implemented. The smallest version
they propose is a 32-bit architecture. An optimized 32-bit architecture targeting
FPGA implementations is proposed in [88], where round keys are precomputed and
S-Boxes are optimized using lookup tables.
Concerning the hardware implementation of the round functions, first it must
be noted that no optimizations can be performed on ShiftRows and AddRoundKey
transformations, since no logic gates are needed for the former transformation and
only one XOR layer is needed for the latter. However, several optimizations can be
found for S-Boxes and MixColumns.
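To make the point about ShiftRows and AddRoundKey concrete, the following is a minimal software sketch, assuming a 16-byte state stored column by column (byte index = row + 4 * column), as in FIPS 197. The function names are illustrative and are not taken from the cited works. ShiftRows reduces to a fixed permutation of byte positions, which corresponds to pure wiring in hardware, and AddRoundKey is a single layer of XORs.

```python
# ShiftRows as a fixed permutation: output byte (row r, column c) is taken
# from input byte (row r, column (c + r) mod 4). In hardware this is wiring.
SHIFT_ROWS_PERM = [r + 4 * ((c + r) % 4) for c in range(4) for r in range(4)]


def shift_rows(state):
    """Reorder the 16 state bytes; no arithmetic is performed."""
    return [state[i] for i in SHIFT_ROWS_PERM]


def add_round_key(state, round_key):
    """XOR each state byte with the matching round-key byte (one XOR layer)."""
    return [s ^ k for s, k in zip(state, round_key)]
```

Because the permutation is constant, a synthesizer has nothing to minimize for ShiftRows, and the XOR layer of AddRoundKey is already a single gate level per bit.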
The MixColumns function can be implemented either in a straightforward manner
by instantiating all the XOR gates required to perform the matrix multiplication, or
in an iterated (and smaller) way, where the overall multiplication is completed in
four clock cycles (one for each byte). For the S-Boxes, several approaches have been
proposed:
1. ROM-based [433].
2. Truth table implementation (combinational logic): the S-Box is written in any
hardware description language as a truth table and it is converted into a gate-level
circuit using a logic synthesizer [119]. This solution allows the best performance
in terms of speed and dynamic power consumption.
3. Lookup tables, suited for FPGA implementations [88, 135, 175, 328].
4. Mathematical implementation: the arithmetic of the SubBytes function over
$\mathbb{F}(2^8)$ is mapped to the isomorphic field $\mathbb{F}((2^4)^2)$. This
alternative can provide better solutions in terms of area occupation and power
consumption, at the cost of lower speed [351, 421, 433].
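As an illustration of how the table-based and mathematical views of the S-Box relate, the sketch below computes each S-Box entry from its definition (multiplicative inverse followed by the affine transformation of FIPS 197) and collects the results in a 256-entry table. It is only a software sketch under stated assumptions: the inversion is done directly in GF(2^8) with the AES polynomial x^8 + x^4 + x^3 + x + 1, not through the composite-field mapping used in [351, 421, 433], and all function names are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse computed as a^254; by convention 0 maps to 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rotl8(x, n):
    """Rotate an 8-bit value left by n positions."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def affine(x):
    """Affine transformation of SubBytes, with constant 0x63."""
    return x ^ rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 3) ^ rotl8(x, 4) ^ 0x63


# The complete table: this is the content a ROM, lookup table, or truth
# table description would hold.
SBOX = [affine(gf_inv(x)) for x in range(256)]
```

The table `SBOX` corresponds to what approaches 1 to 3 store or synthesize, while `gf_inv` and `affine` correspond to the computation that approach 4 implements in logic, there with the inversion carried out in the smaller subfield.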
To provide an idea of the differences between truth table-based implementations
and mathematical implementations, we implemented the solutions proposed in [119]
(truth table) and [421] (mathematical) in VHDL and synthesized the two circuits.
Table 6.1 shows the comparison of area, speed and power consumption obtained
using a 90 nm technology library provided by STMicroelectronics.
A compact solution is proposed in [357] for implementing both encryption and
decryption in the same device. Encryption and decryption hardware are merged by
sharing the AddRoundKey XOR gates, the GF(2^8) inverters in the S-Boxes, and the
terms common to the permutation functions MixColumns and inverse MixColumns.
This optimization targets applications that do not require duplex operation, in which
encryption and decryption are performed at the same time.
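One known way in which MixColumns and inverse MixColumns can share logic is the decomposition given in the Rijndael proposal: the inverse polynomial equals the forward polynomial multiplied by 04 x^2 + 05, so the inverse transformation is a small preprocessing step followed by the forward one. The sketch below illustrates this on a single four-byte column. It is an assumption that this resembles the sharing used in [357]; the source does not give the details, and the function names are illustrative.

```python
def xtime(a):
    """Multiply a byte by 02 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return (a ^ 0x11B) if a & 0x100 else a


def mix_column(col):
    """Forward MixColumns on one column (coefficients 02, 03, 01, 01)."""
    a0, a1, a2, a3 = col
    t = a0 ^ a1 ^ a2 ^ a3
    return [
        a0 ^ t ^ xtime(a0 ^ a1),
        a1 ^ t ^ xtime(a1 ^ a2),
        a2 ^ t ^ xtime(a2 ^ a3),
        a3 ^ t ^ xtime(a3 ^ a0),
    ]


def inv_mix_column(col):
    """Inverse MixColumns as a preprocessing step plus the forward transform."""
    a0, a1, a2, a3 = col
    u = xtime(xtime(a0 ^ a2))  # 04 * (a0 + a2)
    v = xtime(xtime(a1 ^ a3))  # 04 * (a1 + a3)
    # Multiplication of the column by 04 x^2 + 05 modulo x^4 + 1.
    pre = [a0 ^ u, a1 ^ v, a2 ^ u, a3 ^ v]
    return mix_column(pre)
```

In this formulation the forward datapath is reused unchanged for decryption, and only the two extra terms `u` and `v` are added in front of it.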