Cryptography Reference
In-Depth Information
message digest algorithms. All were created by Dr. Ron Rivest, who was also
one-third of the RSA team.
Understanding MD5
The goal of MD5, specifi ed in RFC 1321, or any secure hashing algorithm, is to
reduce an arbitrarily sized input into an n -bit hash in such a way that it is very
unlikely that two messages, regardless of length or content, produce identi-
cal hashes — that is, collide — and that it is impossible to specifi cally reverse
engineer such a collision. For MD5, n
128 bits. This means that there are 2 128
possible MD5 hashes. Although the input space is vastly larger than this, 2 128
makes it highly unlikely that two messages will share the same MD5 hash. More
importantly, it should be impossible, assuming that MD5 hashes are evenly,
randomly distributed, for an attacker to compute a useful message that collides
with another by way of brute force.
MD5 operates on 512-bit (64-byte) blocks of input. Each block is reduced to a
128-bit (16-byte) hash. Obviously, with such a 4:1 ratio of input blocks to output
blocks, there will be at least a one in four chance of a collision. The challenge
that MD5's designer faced is making it diffi cult or impossible to work backward
to fi nd one.
If the message to be hashed is greater than 512 bits, each 512-bit block is hashed
independently and the hashes are added together, being allowed to overfl ow, and
the result is the fi nal sum. This obviously creates more potential for collisions.
Unlike cryptographic algorithms, though, message digests do not have to be
reversible — in fact, this irreversibility is the whole point. Therefore, algorithm
designers do not have to be nearly as cautious with the number and type of
operations they apply to the input. The more operations, in fact, the better; this
is because operations make it more diffi cult for an attacker to work backward
from a hash to a message. MD5 applies 64 transformations to each input block.
It fi rst splits the input into 16 32-bit chunks, and the current hash into four
32-bit chunks referred to tersely as A, B, C, and D in the specifi cation. Most of
the operations are done on A, B, C, and D, which are subsequently added to
the input. The 64 operations themselves consist of 16 repetitions of the four bit
fl ipping functions F, G, H, and I as shown in Listing 4-2.
Listing 4-2: “md5.c” bit manipulation routines
unsigned int F( unsigned int x, unsigned int y, unsigned int z )
{
return ( x & y ) | ( ~x & z );
}
unsigned int G( unsigned int x, unsigned int y, unsigned int z )
{
return ( x & z ) | ( y & ~z );
}
 
Search WWH ::




Custom Search