Cryptography Reference
plaintexts, one million characters, say, the histogram of ciphertext symbols will
be approximately flat, with each ciphertext symbol occurring approximately 1000
times.
As a result of this observation, frequency analysis of single ciphertext symbols
will be a waste of time. Such analysis will tell us nothing about the frequency of
the underlying plaintext letters.
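The flattening effect can be simulated with a minimal Python sketch. It assumes a homophonic code with 1000 ciphertext symbols in which each plaintext letter is assigned a number of symbols proportional to its frequency. The per-letter counts in SYMBOLS_PER_LETTER, the helper names build_key and encrypt, and the synthetic plaintext are illustrative assumptions, not values taken from the text.

```python
import random
from collections import Counter

# Assumed allocation: rough English letter frequencies per 1000 letters.
# These counts are illustrative only; they are chosen to sum to 1000.
SYMBOLS_PER_LETTER = {
    "E": 127, "T": 91, "A": 82, "O": 75, "I": 70, "N": 67, "S": 63,
    "H": 61, "R": 60, "D": 43, "L": 40, "C": 28, "U": 28, "M": 24,
    "W": 23, "F": 22, "G": 20, "Y": 20, "P": 19, "B": 15, "V": 10,
    "K": 8, "J": 1, "X": 1, "Q": 1, "Z": 1,
}
assert sum(SYMBOLS_PER_LETTER.values()) == 1000


def build_key(rng):
    """Randomly partition the symbols 0..999 among the plaintext letters."""
    symbols = list(range(1000))
    rng.shuffle(symbols)
    key, start = {}, 0
    for letter, count in SYMBOLS_PER_LETTER.items():
        key[letter] = symbols[start:start + count]
        start += count
    return key


def encrypt(plaintext, key, rng):
    """Replace each letter by a randomly chosen symbol from its own list."""
    return [rng.choice(key[letter]) for letter in plaintext]


rng = random.Random(2024)  # fixed seed so the run is repeatable
key = build_key(rng)

# Synthetic "plaintext": one million letters drawn with the assumed frequencies.
letters = list(SYMBOLS_PER_LETTER)
weights = list(SYMBOLS_PER_LETTER.values())
plaintext = rng.choices(letters, weights=weights, k=1_000_000)

# Count how often each ciphertext symbol occurs.
histogram = Counter(encrypt(plaintext, key, rng))
print(len(histogram), min(histogram.values()), max(histogram.values()))
```

Under these assumptions every one of the 1000 symbols should occur roughly 1000 times, so the smallest and largest counts printed should both lie close to 1000, even though the underlying letters range from very common (E) to very rare (Z).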
PROBLEMS WITH HOMOPHONIC ENCODING
Homophonic encoding is designed precisely to counter single letter frequency
analysis. It does not automatically prevent frequency analysis of bigrams,
although in our example of homophonic encoding this would involve analysing one
million (1000 × 1000) possible bigrams, which would require a substantial amount
of ciphertext to be obtained before the analysis became effective. However, there
are two more significant problems with homophonic encoding that all but rule it
out as a practical encryption technique:
Key size. The key for a homophonic code consists of the specification of the
ciphertext symbols assigned to each plaintext letter. Very crudely (an accurate
measurement is beyond our mathematical ambitions), this involves storing a table
that contains a list of the ciphertext symbols assigned to each plaintext letter.
Each ciphertext symbol appears once in this table. If we do this on a computer
then we need to represent this table in binary. We note that each symbol from a
set of size 1000 can be represented by 10 bits (this follows from the
relationship between binary and decimal numbers, which is discussed further in
the Mathematics Appendix). So our key is, very approximately,
1000 × 10 = 10 000 bits long. By the standards of modern cryptosystems this
is very large (for example, AES has a key size of between 128 and 256 bits, as
discussed in Section 4.5).
Ciphertext expansion. We have extended our plaintext alphabet from 26 letters
to a ciphertext alphabet of 1000 symbols. This means that it takes more
information (think of information as the number of bits needed) to represent
the ciphertext than it takes to represent the plaintext. In other words, the
ciphertext is much bigger than the plaintext. Very roughly, we need:
• 5 bits to represent each of the 26 plaintext letters, but
• 10 bits to represent each of the 1000 ciphertext symbols.
Hence each 5-bit plaintext letter will be encrypted to a 10-bit ciphertext
symbol. This increase in the size of the ciphertext is often referred to as
message expansion and is generally regarded as an undesirable property of a
cryptosystem because the ciphertext becomes more 'expensive' to send across
the communication channel.
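The two rough estimates above (key size and expansion) can be reproduced with a short Python sketch. The function name bits_per_symbol and the constant names are hypothetical, introduced only for illustration; as in the text, the figures are crude approximations rather than an accurate measurement.

```python
def bits_per_symbol(alphabet_size):
    """Smallest number of bits that can give every symbol a distinct label."""
    return (alphabet_size - 1).bit_length()


PLAINTEXT_LETTERS = 26
CIPHERTEXT_SYMBOLS = 1000

plain_bits = bits_per_symbol(PLAINTEXT_LETTERS)     # 5, since 2**5 = 32 >= 26
cipher_bits = bits_per_symbol(CIPHERTEXT_SYMBOLS)   # 10, since 2**10 = 1024 >= 1000

# Key size: one table entry of cipher_bits bits for each ciphertext symbol.
key_bits = CIPHERTEXT_SYMBOLS * cipher_bits         # 1000 x 10 = 10000

# Expansion: ciphertext bits needed per bit of plaintext.
expansion = cipher_bits / plain_bits                # 10 / 5 = 2.0

print(f"bits per plaintext letter:   {plain_bits}")
print(f"bits per ciphertext symbol:  {cipher_bits}")
print(f"approximate key size (bits): {key_bits}")
print(f"expansion factor:            {expansion}")
```

On this crude estimate the key is about 39 times longer than the largest AES key of 256 bits, and the ciphertext is about twice the size of the plaintext.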
Of course, we have based the above analysis on our example of a homophonic code
that uses 1000 ciphertext symbols. It is possible to design simpler, less effective,
homophonic codes that use fewer ciphertext symbols (another example of the
 