time, while A can be hard-coded into the design by a VHDL preprocessor. This
saves hardware resources on the FPGA, reduces the complexity and eliminates
potentially critical paths.
The reference keystream can be hard-coded as well.
Early Abort: A cipher key can be considered invalid as soon as one bit of the
generated keystream differs from the reference. In such a case, the comparison to
the reference keystream can be aborted early such that the unit can immediately
continue with the next key candidate.
The probability that k subsequent bits of the keystream are correct for a
wrong key is $2^{-k}$. On average, the comparison for a wrong key already fails after
two keystream bits. Therefore, $n - 2$ cycles can be saved on average in comparison to a
deterministic unit that always compares n bits.
We compare at most 32 bits and thus save 30 cycles on average.
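The Early Abort figures above can be checked with a short Python sketch. It is only an illustration under the assumption that each keystream bit of a wrong key matches the reference independently with probability 1/2; the function names are hypothetical and do not come from the hardware design. Under that assumption the expected number of compared bits is 2 − 2^(1−n), which is very close to 2 for n = 32.

```python
from fractions import Fraction


def early_abort_compare(candidate_bits, reference_bits):
    """Compare bit by bit and stop at the first mismatch.

    Returns (match, bits_compared).
    """
    compared = 0
    for cand, ref in zip(candidate_bits, reference_bits):
        compared += 1
        if cand != ref:
            return False, compared  # abort early, key candidate is wrong
    return True, compared


def expected_compared_bits(n):
    """Expected number of compared bits for a wrong key, at most n bits.

    Assumes every bit matches independently with probability 1/2.
    """
    half = Fraction(1, 2)
    # First mismatch exactly at bit k (k = 1..n) has probability 2^-k.
    expected = sum(k * half**k for k in range(1, n + 1))
    # All n bits match by chance (probability 2^-n): n bits were compared.
    expected += n * half**n
    return expected


if __name__ == "__main__":
    n = 32
    e = expected_compared_bits(n)
    print(float(e), float(n - e))  # expected bits compared, cycles saved
```

For n = 32 the expected saving is therefore 30 cycles up to a negligible term, in line with the figure quoted above.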
Pre-ciphering Pipeline: With the Early Abort optimization, several DSC
units are competing to be loaded with a new initial state. As the arbitration
logic complexity rises with the number of competing units, this number is to be
kept low. A good way to do this is outsourcing the pre-ciphering phase into a
strictly sequential, deterministic pipeline. With this optimization, the state after
pre-ciphering is directly loaded into the computing DSC units.
Input Buffering: Idle time of the FPGA has a negative impact on the effective
performance. Therefore, an input buffer is used such that the PC can enqueue
multiple tasks and the FPGA can immediately load the next task as soon as the
previous one is finished.
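The buffering idea can be sketched as a bounded first-in, first-out queue. The following Python model is a minimal illustration, not the actual FPGA logic: the class name, the capacity, and the representation of a task as a plain value (for example, the parameters of one sub key space) are assumptions.

```python
from collections import deque


class InputBuffer:
    """Bounded FIFO between the PC (producer) and the FPGA (consumer)."""

    def __init__(self, capacity):
        self.capacity = capacity  # hypothetical buffer depth
        self._tasks = deque()

    def enqueue(self, task):
        """PC side: queue a task; returns False if the buffer is full."""
        if len(self._tasks) >= self.capacity:
            return False
        self._tasks.append(task)
        return True

    def next_task(self):
        """FPGA side: fetch the next task, or None if nothing is queued."""
        return self._tasks.popleft() if self._tasks else None


if __name__ == "__main__":
    buf = InputBuffer(capacity=2)
    buf.enqueue("task 0")
    buf.enqueue("task 1")
    # The consumer finds work immediately after finishing the previous task.
    print(buf.next_task(), buf.next_task(), buf.next_task())
```

As long as the PC keeps at least one task queued, the consumer never has to wait for the comparatively slow host link between tasks.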
5.3 Implementation
For our implementation, a Xilinx Spartan-3E 1200 (XC3S1200E) FPGA on a
Digilent Nexys 2 board was used. The PC communication was implemented via
the on-board RS-232 interface.
Our final implementation includes all optimizations described in Section
5.2. The runtime of the design is not entirely deterministic because, for a specific
keystream, the position of the first failing comparison is unknown. Therefore,
the key generator was given the ability to be paused, which is necessary when
all available DSC units are busy.
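Why the key generator must be pausable can be illustrated with a simplified cycle-level model in Python. This is only a sketch under stated assumptions: one new key candidate is offered per cycle, a unit stays busy for a given number of cycles per key, and the function and parameter names are hypothetical. It does not model the real DSC timing.

```python
def simulate(num_keys, num_units, cycles_per_key):
    """Count total cycles and generator stalls in a simplified model.

    cycles_per_key: function mapping a key index to the number of cycles
    a unit is busy with that key (hypothetical timing model).
    """
    busy = [0] * num_units  # remaining busy cycles per unit
    next_key = 0
    stalls = 0
    cycle = 0
    while next_key < num_keys or any(busy):
        # Every unit makes one cycle of progress.
        busy = [max(0, b - 1) for b in busy]
        if next_key < num_keys:
            free = next((i for i, b in enumerate(busy) if b == 0), None)
            if free is None:
                stalls += 1  # all units busy: the key generator pauses
            else:
                busy[free] = cycles_per_key(next_key)
                next_key += 1
        cycle += 1
    return cycle, stalls


if __name__ == "__main__":
    # Keys that fail after 2 cycles never stall four units in this model.
    print(simulate(8, 4, lambda k: 2))
    # Keys that need the full 32 cycles force the generator to pause.
    print(simulate(8, 4, lambda k: 32))
```

In this model, short comparisons keep the generator running continuously, while a run of long comparisons occupies all units and forces it to wait.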
Figure 3 shows the structure of the key search unit, which forms the essential
part of our hardware design. The dotted lines in the diagram denote the
hard-coded data. The “State Offset” is sent to the FPGA at run time for each sub
key space. It is determined by the attacked IV and the vector b.
One pipelined key generator (see Section 5.2) was chosen to serve four DSC units;
this is the maximum number implementable on one Look-Up Table.
The key search unit consumes about 30% of the FPGA resources in total,
such that three instances can be created on our device. This enables searching
three sub key spaces at the same time.
 