Re: Added hashes - LONG
Available news archives: comp.lang.tcl - comp.lang.python - comp.security.firewalls - sci.crypt - comp.lang.php - comp.lang.javascript
Google
 
Web news.hping.org


sci.crypt archive

Re: Added hashes - LONG

From: John A. Malley <102667.2235@compuserve.com>
Date: Wed Dec 28 2005 - 22:42:49 CET

David Wagner wrote:
[...]
>
>
> Unfortunately, I suspect this is probably too weak. It only provides
> any kind of security if the adversary never gets to see any hash input;
> otherwise, all bets are off. But protocols in the real world often let
> the adversary see the input to the hash, so even if the hashes meet this
> condition, that tells us nothing about what will happen when we use the
> hashes with a protocol that lets the adversary ever see any part of any
> hash input (or even worse, choose a part of it).

This is an interesting argument, and it prompted me to re-read "Cryptographic
Hash-Function Basics: Definitions, Implications, and Separations for Preimage
Resistance, Second-Preimage Resistance, and Collision Resistance," P. Rogaway
and T. Shrimpton, FSE 2004 (and on the ICAR ePrint server.)

Preimage resistance, second-preimage resistance and collision resistance are the
three basic notions describing cryptographic hash function security. Rogaway and
Shrimpton methodically refine the three notions into seven security measures.
It's a reasonable hypothesis that any notion of cryptographic hash functions
"independence" should be expressible in terms of those seven security measures.
I'll use their notion in this post.

Let's start with k >= 2 hash functions that each take {0,1}^* -> {0,1}^n. I'll
make the claim that we can always fashion a keyed hash function family by
indexing them by k in a table, K is the set of all keys. Let Y ={0,1}^n, the set
of all output values from the hash. Rogaway and Shrimpton define security
characteristics with respect to messages M in {0,1}^m, a subset of {0,1}^*.

(All adversaries show may be probabilistic algorithms.)

"Preimage resistance," (Pre), is defined in the paper as

     Pre[m]
Adv (A) = Pr[ k <-r- K; M<-r-{0,1}^m; y<-H_k(M); M'<-A(k,y): H_k(M')= y ]
     H

and the Adversary knows everything but the preimage M.

"everywhere Preimage-resistance," (ePre), is defined as

     ePre[m]
Adv (A) = max {Pr[ k <-r- K; M<-r-{0,1}^m; M'<-A(k): H_k(M')= y ]}
     H y in Y

Here the Adversary gets to pick the output y in the range of H_k() that's most
likely to find the preimage.

"always Preimage-resistance," (aPre), is defined as

     aPre[m]
Adv (A) = max {Pr[ M<-r-{0,1}^m; y<-H_k(M); M'<-A(y): H_k(M')= y ]}
     H k in K

Here the Adversary gets to pick the hash function (k, thus H_k()) in the hash
function family for which he's most likely to find preimages.

Now as I understand the paper:

For the least possible Adv-Pre[m], the probability of the Adversary finding the
random challenge preimage of y and a random challenge H_k() should be 2^(-m).

For the least possible Adv-ePre[m], the probability of the Adversary finding the
preimage of any y the Adversary desires, given a random challenge H_k(), should
be 2^(-m). No y should be easier to invert than any other for any given hash
function H_k().

For the least possible Adv-aPre[m], the probability of the Adversary finding the
random challenge preimage of y no matter what instance k (and thus H_k() out of
H) the Adversary chooses should be 2^(-m). No hash function instance in the
family should be easier to invert when given a random challenge y than any other
instance.

Now I assume from the definition of aPre[m] advantage that the Adversary gets to
pick a key (and thus a particular hash instance) and is given the hashed output
y of a uniformly random selected challenge message M. What's not clear to me
from reading the paper is how the Adversary goes about finding the key that
maximizes the probability of success - the work factor to do so.

Depending on what I assumed here when reading the paper, the following analysis
may be new, or it may be implied by the max over all K, and it's just restating
what the authors intended (but I don't think they meant this when writing their
paper.)

What if we allow the Adversary to specify as many hash instances as he wants,
using a subset K' keys out of K, and to get the hashed output from each in
response to the same uniformly random selected challenge message M? Can the
Adversary tell what M is with non-negligible probability?

(Of course there's a resource limit to the number of keys tried, and the storage
for the outputs.)

Something like:

     aPre[m]-IND
Adv (A) = max {Pr[ M<-r-{0,1}^m; over K', y_i<-H_i(M); M'<-A(Y'): H_k(M')= y ]}
     H,K' k out of K'

and Y' is the set of hash outputs y_i, |Y'| = |K'|, and 1 <= i <= |K'|.

Now the Adversary measures success as his ability to invert any one of those
selected hashes when all of them get the same unknown input.

The Adversary picks a subset of hash function instances out of the hash family
(K' out of K,) and uses their combined information to find the preimage common
to all for the one instance in K', k, that maximizes the probability of inversion.

Ideally, for the least possible Adv^aPre[m]-IND, when the Adversary gets the
outputs from |K'| different hash functions in the keyed hash function family H
applied to the same secret uniformly random challenge M, the probability of the
Adversary inverting any one of those |K'| outputs should be 2^(-m), for any k in
K'. No hash function instance in the family should be easier to invert when
given a random challenge y AND the outputs from other hash function family
members in response to the same input, than any other instance.

This is keyed hash function family member "independence" with respect to
inverting preimages of bit length m, Adv^aPre[m]-IND, and I conjecture (but need
to show...)

     aPre[m]-IND aPre[m]
Adv (A) >= Adv (A)
     H,K' H

with both equal to 2^(-m) for an ideal keyed hash function family.

Similarly, for second-preimage resistance:

"Second-preimage resistance," (Sec), is defined in the paper as

     Sec[m]
Adv (A) = Pr[ k<-r- K; M<-r-{0,1}^m; M'<-A(k,M):(M!=M')^( H_k(M)= H_k(M')]
     H

and the Adversary knows everything (i.e. k, M.) Every hash function should map
2^m input strings uniformly at random into one of 2^n "bins" or output strings.

"everywhere Second-preimage resistance," (eSec), is defined as

     eSec[m]
Adv (A) = max {Pr[ k <-r-K; M'<-A(k):(M!=M')^( H_k(M)= H_k(M')]}
     H M in {0,1}^m

Here the Adversary gets to pick the m bit message M for which he's most likely
to find a second-preimage for a challenge hash function chosen uniformly at
random from the hash function family.

"always Second-preimage resistance," (aSec), is defined as

     aSec[m]
Adv (A) = max {Pr[ M<-r-{0,1}^m; M'<-A(M):(M!=M')^( H_k(M)= H_k(M')]}
     H k in K

Here the Adversary gets to pick the hash function (from k, and thus H_k()) in
the hash function family for which he's most likely to find a second preimage
for a uniformly random selected challenge message M.

Using the same idea as before, allow the Adversary to specify as many hash
instances as he wants using a subset K' keys out of K, and to get the hashed
output from each in response to the same uniformly random selected challenge
message M. Can the Adversary use this information from other hash function
family members to find a second preimage for a key k in K'? Something like:

     aSec[m]-IND
Adv (A) = max {Pr[M<-r-{0,1}^m; over K',y_i<-H_i(M); M'<-A(Y'):
     H,K' k out of K' (M!=M')^( H_k(M)= H_k(M')]}
                       

and Y' is the set of hash outputs y_i, |Y'| = |K'|, and 1 <= i <= |K'|.

The Adversary picks a subset of hash function instances out of the hash family
(K' out of K,) and uses their combined information to find a second preimage for
one instance in K', k, that maximizes the probability of finding a second preimage.

Ideally, for the least possible Adv^aSec[m]-IND, when the Adversary gets the
outputs from |K'| different hash functions applied to the same secret uniformly
random challenge M , the probability of the Adversary finding a second preimage
M ' for any one of those |K'| outputs should be 2^(n-m), no matter what k in K'.
It should be no easier to find a second preimage for a hash function instance in
the family when given a random challenge M AND the outputs from other hash
function family members in response to the same input, than for any other instance.

This is keyed hash function family member "independence" with respect to
second preimages of bit length m, Adv^aSec[m]-IND, and I conjecture (but need to
show...)

     aSec[m]-IND aSec[m]
Adv (A) >= Adv (A)
     H,K' H

with both equal to 2^(n-m) for an ideal keyed hash function family.

-----

Does this look like a way to put independence of cryptographic hash functions
into a more formal (and concrete) setting? I could use feedback from the group,
especially about the assumptions I made when reading Rogaway and Shrimpton's
paper and the definitions of aSec and aPre.

John A. Malley
102667.2235@compuserve.com
Received on Tue Jan 3 03:41:41 2006