drfremove@nber.org wrote:
> Joseph Ashwood wrote:
> > "drfremove@nber.org" <feenberg@gmail.com> wrote in message
> > news:1127996717.351192.56310@g14g2000cwa.googlegroups.com...
> > > The requirement is that the same SSN [and only the SSN] should encrypt to
> > > the same value [for later comparison]
> >
> > Let's begin by attempting to find the security requirements (I'll give you a
> > hint, any system that has to hold only the SSN like this will not be
> > secure). First the SSN is not anything approaching random, in fact the
> > Social Security Administration has a website on how they are generated
> > (http://www.ssa.gov/history/ssn/geocard.html) so there is very little that
> > is unguessable, for example 602-12-3456 (apologies to whoever has this
> > number, it has been issued and I only guessed) is from California,
> > eliminating 3 digits immediately, and the 12 identifies that it is fairly
> > early in the usage, in fact while I don't have the timing information in
> > front of me 602-12-xxxx is probably mostly retired if not permanently so. So
> > the level of protection has to be extremely high in order to prevent the
> > leakage of the last group of numbers (serial number).
>
> The dataset does include state of residence, which is correlated with
> state of issuance of SSN. However, if I am looking for a particular
> person, knowing that they are from California allows me to exclude 80%
> of the records. Even if that knowledge also allows me to guess at part
> of the SSN, once I have done so it really only allows me to exclude the
> same 80%, so SSN structure isn't a source of weakness unless it helps
> the intruder solve for the remaining digits. With the right encryption
> technology, it shouldn't have that effect.
>
> >
> > The requirement of a 1-1 mapping leads to low levels of protection. Even
> > assuming all 9 digits are unknown that is less than 2^30 work, so a modern
> > computer can run through all combinations in a matter of minutes. This means
>
> Let me make it clear that the threat model is not that someone would
> take a record and solve for the SSN that goes with it, but that they
> would locate the record in the database that had a particular SSN of
> interest to them (an ex-spouse, for instance). With 100 million SSNs in
> the database, that may be much harder (again, depending on the method).
Why do you need to retain the original SSN? Can't you just renumber
the records, starting at 1 and running up to 999,999,999 as needed, or
use random numbers? If you need the original SSN later, say to compare
against a newer version of the records from the issuing authority, you
can create an index when you renumber and keep that index secure and
off-line.
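
Something along these lines is what I have in mind for the random-number
variant. It is only a rough sketch: the function name pseudonymize, the
9-digit surrogate format and the dummy SSNs are my own assumptions for
illustration, not anything from your dataset. The same SSN always gets
the same surrogate, so records can still be compared, and the returned
index is the only thing that links a surrogate back to a real SSN.

```python
import secrets


def pseudonymize(ssns):
    """Replace each distinct SSN with a random 9-digit surrogate ID.

    Hypothetical helper for illustration. Returns (surrogates, index):
    surrogates parallels the input list; index maps surrogate -> SSN and
    is the table that would be kept secure and off-line.
    """
    forward = {}     # SSN -> surrogate, so a repeated SSN maps to the same ID
    index = {}       # surrogate -> SSN, the sensitive lookup table
    surrogates = []
    for ssn in ssns:
        if ssn not in forward:
            # Draw from the OS random source until the surrogate is unused.
            while True:
                candidate = f"{secrets.randbelow(10**9):09d}"
                if candidate not in index:
                    break
            forward[ssn] = candidate
            index[candidate] = ssn
        surrogates.append(forward[ssn])
    return surrogates, index


if __name__ == "__main__":
    # Dummy values only (area number 000 is never issued).
    released, offline_index = pseudonymize(
        ["000-00-0001", "000-00-0002", "000-00-0001"]
    )
    print(released[0] == released[2])   # same SSN, same surrogate
    print(len(offline_index))           # one index entry per distinct SSN
```

Because the surrogates are drawn at random rather than computed from the
SSN, someone holding only the released file cannot run through all
possible SSNs to find a match; they would need the index as well.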
karl m
Received on Sat Oct 15 04:37:52 2005