5.6 Collision Resistant Hash Functions
We have already mentioned hash functions and the fact that a generic hash function
is one that maps strings of (almost) arbitrary length to short fixed-length strings
but, in this section, we are going to deal with hash functions specially designed for
cryptographic purposes. Imagine that we want to assign to each message a short
fingerprint or message digest that guarantees the message integrity so that if the
message is changed by someone or something then the corresponding fingerprint is
different and we detect the change simply by comparing it with the original one.
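The fingerprint-comparison idea can be sketched as follows, using SHA-256 purely as a stand-in for a generic hash function (the section has not yet fixed a concrete choice):

```python
# Integrity check via a fixed-length digest. SHA-256 here is only an
# illustrative stand-in for the generic hash function h of the text.
import hashlib

def digest(message: bytes) -> bytes:
    """Map a message of (almost) arbitrary length to a 32-byte hash value."""
    return hashlib.sha256(message).digest()

original = b"transfer 100 euros to Alice"
fingerprint = digest(original)             # stored or sent alongside the message

received = b"transfer 900 euros to Alice"  # message tampered with in transit
print(len(fingerprint))                    # 32 bytes, regardless of message length
print(digest(received) == fingerprint)     # False: the change is detected
```

Note that the digest has the same short length whether the message is a few bytes or many megabytes, which is exactly the property that forces us to give up injectivity.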
These message digests will be called hash values or simply hashes. In principle, for
this setup to work we need that the function

    message ↦ hash value

be injective because otherwise, any other message with the same image would be
deemed authentic by this procedure. Now, if we use for the hashes the same alphabet,
say {0, 1}, as for the messages, then the injectivity of the function requires the use
of hashes that are, on average, as long as the messages themselves. This defeats the
purpose of using these hashes because we want to deal with messages of arbitrary
length—or, at least, with very long messages—and we want the hashes to be short
and easily manageable; if they have the same length as the messages they are not
useful as one would rather check the integrity of the message by direct inspection.
So, if we want to use short hashes we have to dispense with injectivity. But then, as
already mentioned, all preimages of a given hash value would be authenticated by
it, which is clearly undesirable and even more so in a cryptographic context (more
about that later). So it seems that we are in a deadlock ... or are we?
The solution is to consider functions h : {0, 1}* → {0, 1}^n that, while far from
being injective,¹ look injective to any PPT adversary. The idea is then that it should
be computationally infeasible to show, in an effective way, that these functions are
not injective or, in other words, to find a collision, i.e., two different messages with
the same hash value.
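To see why collisions must exist, and why the output length n matters, here is a hypothetical toy hash obtained by truncating SHA-256 to n = 16 bits (a deliberately broken parameter choice, not anything the text proposes); a simple brute-force search then exhibits a collision explicitly:

```python
# Toy hash: truncate SHA-256 to 2 bytes (n = 16 bits). With so short an
# output, a collision is found almost immediately, illustrating that a
# function into {0,1}^n is far from injective and that security must rest
# on n being large enough that no PPT adversary can exhibit a collision.
import hashlib

def h(message: bytes, n_bytes: int = 2) -> bytes:
    return hashlib.sha256(message).digest()[:n_bytes]

seen = {}
collision = None
for i in range(1 << 20):
    m = i.to_bytes(4, "big")      # distinct 4-byte messages
    v = h(m)
    if v in seen:                 # same 16-bit hash seen before: a collision
        collision = (seen[v], m)
        break
    seen[v] = m

m1, m2 = collision
print(m1 != m2 and h(m1) == h(m2))  # True: two distinct messages, same hash
```

By the pigeonhole principle the loop is guaranteed to stop within 2^16 + 1 distinct inputs; in practice a birthday-style argument makes it stop after roughly 2^8 tries.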
This idea poses another problem. Here, in contrast with the preceding section
where we defined universal hash functions, we are dealing with a single, unkeyed
function. The idea that finding a collision should be hard is difficult to formalize for
a single function because for the given h collisions always exist, and so there is an
algorithm that outputs m and m′ such that h(m) = h(m′) (it is another matter to find
these values in practice). In contrast, if keyed hash functions h_k : {0, 1}* → {0, 1}^n
are considered, this idea may easily be formalized in complexity-theoretic terms: we
should simply require that any probabilistic polynomial-time algorithm (or adversary)
can find a collision for h_k, where k is randomly chosen, only with negligible
probability. Note that in this case there is no such collision-finding algorithm as
before; now the family {h_k}
1 Since {0, 1}* is an infinite set, there is an infinite number of messages which are mapped to the
same hash value and, even if in practice we restrict the domain of the function to some {0, 1}^m
with m > n, the function cannot be injective by the pigeonhole principle.