Re: Added hashes.
Available news archives: comp.lang.tcl - comp.lang.python - comp.security.firewalls - sci.crypt - comp.lang.php - comp.lang.javascript
Google
 
Web news.hping.org


sci.crypt archive

Re: Added hashes.

From: John A. Malley <johnmrcswa@earthlink.net>
Date: Thu Dec 15 2005 - 07:54:48 CET

David Wagner wrote:
> John A. Malley wrote:
>
>>Two hash functions are independent [...] if the pair cannot be
>>distinguished from a pair of random oracles, [...]
>
>
> Yeah. In a very loose sense, I think that's about the right intuition.
> However, the problem is that when you try to make it precise, everything
> falls apart and the obvious formalization gives you useless results.
> Actually, the same problem occurs even if you want to talk about just a
> single hash function.
>
> Here, let me illustrate. I have found a way to distinguish SHA1 from a
> random oracle.

[...]

And it's a good one! I'm obviously barking up the wrong tree here, thanks for
that example.

>
> I don't know of any way to formalize the assumptions that two hashes F,G
> are independent. Here is the closest I know how to get:
>
> Let P be a protocol that uses the two hashes F,G. Let P* be an
> idealized version that replaces F,G with two independent random
> oracles. Assumption: Any attack on P* can be translated into a
> corresponding attack on P.
>
> However, that's not a very satisfying assumption. For one thing, it
> amounts to assuming what we want to prove. For another, it makes the
> independence of F,G depend on how they are used. You might have one
> protocol for which F,G can be safely modeled as independent, and another
> where they can't.
>
> I think the way to think about this is that, if you start from the
> "assumption" that F,G are independent, then modeling them as a pair of
> independent random oracles is a useful way to use that "assumption" to
> prove things about protocols that use F,G. However, this doesn't give
> you any way to check whether the "assumption" was true in the first place
> or not

>
> This seems analogous to the issues with the random oracle model.

[...]

>
> Bottom line: unkeyed hashes are hard to model in a rigorous framework.

Yes, I agree.

You, Mike Amling and Paul Rubin got me thinking about a different approach to
modeling "independence." Your example distinguisher example shows how easy it
is for the Adversary to distinguish a pair of unkeyed hashes from a pair of
random oracles when the Adversary can choose the inputs. The experiment offers
no insight into what's meant by "independent" hashes.

What if we "turn" the situation around, so the Adversary is not providing inputs
to the hash functions, but only gets outputs, like this:

There are two worlds, world 0 consists of two random oracles returning strings
from {0,1}^n, and world 1 consists of two nonkeyed hash functions G, H that both
map {0,1}^* -> {0,1}^n. The Adversary is given oracle access to the outputs of
the random oracles, and the outputs of the two nonkeyed hash functions. Each
time the Adversary "invokes" the oracle access, he gets a pair of outputs.

Now, in world 1, each time the oracle access is invoked, a random input for the
nonkeyed hash functions is selected at random from a set of inputs, and that
value is fed into both hash functions.

The game is played by flipping a coin, heads the Adversary gets world 0, tails
the Adversary gets world 1.

The Adversary asks for as many output pairs via oracle access as he can given
his computational resources, and tries to determine with non-negligible
probability which world he's in.

Now here's how independence comes into the picture. I'll try a specific example.

Let G be a hash function with preimage resistance, so it's computationally
infeasible to find any input x that hashes to to a given output y1 = G(x).
Given y1, an Adversary doesn't know anything more about the input x that
resulted in y1 than the fact that x is in the domain of G.

Let H(x) = (G(x) + K ) mod 2^n.

Given y2 = H(x), an Adversary still doesn't know anything more about the input x
that resulted in y2 than the fact that x is in the domain of H.

Now use G and H as the hash functions in the experiment. In world 1, the
Adversary gets y1 and y2, where

x <- R - {0,1}^*
y1 = G(x)
y2 = H(x) = (G(x) + K) mod 2^n

The Adversary can make multiple queries in world 1, get the set

<y1(1), y2(1)>, <y1(2), y2(2)>

and if he checks, discovers

  (y2(1) - y1(1))mod 2^n = ( y2(2) - y1(2)) mod 2^n

or in general, for any i != j,

(y2(i) - y1(i))mod 2^n = ( y2(j) - y1(j)) mod 2^n

which means he can tell that the same input is going into both hash functions,
and thus, this is world 1. This situation would hold with negligible probability
in world 0.

OK, a silly simple example, but - I think this points in this direction:

Two hash functions, both with preimage resistance, are independent if when given
the outputs of both in response to the same input, leak no information about
that input. All the Adversary knows is what he already knew, there's an input in
the domain of both of those functions that results in the pair G(x), H(x).

Two hash functions with preimage resistance are not independent if G(x) and H(x)
leak the fact (with non-negligible probability) that the same input x got
applied to both when it was applied to both. (Hm. This still is not sitting well
with me. But I think it's a step in the right direction.)

I'm thinking about how the leaking of information about the input (that it was
the same input applied to both hash functions and not the output of two random
oracles) relates to combining hash functions. At face value, it might weaken the
preimage resistance, the combination's preimage resistance is less than expected
when using one or the other hash.

Feedback appreciated,

John A. Malley
102667.2235@compuserve.com

-- 
"Support Our 'OOPS!"
- George, Condi, Rummy and Dick
Received on Fri Dec 23 20:10:21 2005