Cryptography Reference
In-Depth Information
5.5 Cryptanalysis of Hash Functions
One goal of hash functions is to generate a digest of a source that cannot easily give information about that
source (and ensure the one-wayness of the hash function). The other goal is to ensure that when an attacker
knows the source text and its corresponding hash, he or she cannot easily produce another text that produces the
same hash, especially for texts that are similar to the original source text.
Hence, the main goal of cryptanalysis of hash functions is to either obtain information about the original
source text, or to produce a duplicate hash. A pair of messages with hashes is called a collision . There are three
types of attacks to obtain collisions that we will be concerned with. In increasing order of difficulty, they are:
1. Finding two messages that produce the same hash, with no necessary link to any other particular item.
This is the “easiest” of the goals, since there are no constraints on the output digest. This is a standard
collision attack .
2. Finding another message that generates a target hash. The target hash would probably belong to some
important item. This is called a pre-image attack . (This term is taken from mathematical terminology,
where the hash would be called the image of the message, and the message is called the pre-image of the
hash.)
3. Finding another message that generates a target hash, with the original item being a plausible alternat-
ive to some target original item. This is an extension of the pre-image attack .
For example, if a Word document contains some instructions and a hash is calculated, someone might be
interested in swapping it with a Word document with different instructions, but have it produce the same
hash. This is the hardest collision to produce.
The simplest method on the first and simplest kind of collision is to simply keep choosing random data, cal-
culating the hash, and storing both. The birthday paradox says that we only need to wait and store about
entries (for a hash size of n) before we can expect to find a collision, and hence any better method should have
run time or storage less than
For the other methods, we do not have the birthday paradox to help us out; thus, the standard brute-force
wouldrequirearuntimeorstorageof n. Therefore,anytechnique havingaruntimeorstoragelessthan n would
be an improvement.
As shown above, checksums and CRCs are not very appropriate for digital signatures and other security pur-
poses. The current standard is to use the more sophisticated hashing algorithms, like MD5 and SHA-1 for these
purposes.
The intricate churning used in them makes it very difficult to map anything in the message digest to say any-
thing about the original message. Hence, for both MD5 and SHA-1, it is still a difficult task to find collisions
or any other kinds of weaknesses in these algorithms. The upper limit on time is governed by the birthday para-
dox; thus, for example, the 160-bit hash of SHA-1 would require a store of 2 80 message digests before a better-
than-50-percent chance of finding a collision. Therefore, any method to find collisions should ideally work in
less time than this.
Of course, because of the popularity of the two algorithms, there is a lot of attention given to them by crypt-
analysts.
Going into the full analysis of hash functions is beyond the scope of this topic. Many of the principles used
in Chapter 7 are applied to the hash algorithms, including tracing difference paths through the hash to generate
collisions
Search WWH ::




Custom Search