Re: Ancient history
Available news archives: comp.lang.tcl - comp.lang.python - comp.security.firewalls - sci.crypt - comp.lang.php - comp.lang.javascript
Google
 
Web news.hping.org


sci.crypt archive

Re: Ancient history

From: Nick Maclaren <nmm1@cus.cam.ac.uk>
Date: Sun Jun 12 2005 - 09:30:09 CEST

In article <fZCdnctIVOMw0zbfRVn-uw@comcast.com>,
glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:
>Nick Maclaren wrote:
>
>> All of that is true, and I have similar experiences, but the issue
>> here IS the abstraction. C's model of arrays and areas of data
>> is precisely that of a pointer to the start - with no length
>> included or includable.
>
>I always thought it was possible to do in C, but there doesn't seem
>much interest in actually doing it. If a pointer holds origin, current
>offset, and length then it is possible to do bounds checking even with
>pointer arithmetic.

No. As I have posted before, it ISN'T possible in C, for reasons I
explain in my Objects diatribe. There is more than is in that
document, too, including C99's introduction of intptr_t, but that
gives a good start.

The best that is possible in C is the following:

    A reference is checked that it refers to a correctly aligned
    segment of memory wholly contained within a single allocation
    (i.e. a definition, result of malloc etc.)

    A pointer value that is operated on using only the pointer
    operations defined in the standard (i.e. not intptr_t or any
    of the numerous permitted extensions) does not change a pointer
    into one allocation into a pointer into another.

You CANNOT block overflows from one subsection of an allocation
into another, in general, without changing the standard drastically.

Equally badly, where languages like Fortran 77 can have reliable
bounds checking for a small, constant overhead per reference,
the situation in C is such that the overhead is generally log(N)
with a larger constant, where N is the number of non-definition
allocations (often a million plus). Ouch.

Oh, and hardware support wouldn't help, unless it was designed entirely
for C's bizarre rules - which changed quite radically between K&R C,
C89 and C99. Think pointer validity and weep.

>I was interested in the possibility of a C compiler targeting JVM,
>which has those restrictions. There are some additional complications
>regarding struct and union, but for a large fraction of C code it should
>be possible.

It depends how you define "a large fraction". In particular, it
doesn't include any program that uses any form of the X Windowing
System, or any program which includes a fancy memory management
system or "dump/restore" system.

>> It is possible to build an abstraction on top of that which includes
>> both the pointer and its size, but that is a separate model. Many
>> of the more robust C programs do precisely that - which is precisely
>> using C as a semi-portable assembler, as it was designed for.
>
>In either case there is some overhead to actually doing it.
>That seems to be the biggest reason not to do it.

There is. My belief is that, in a language like Fortran, the overhead
could be as little as 20%. The analysis used to optimise code provides
exactly the right information needed to generate efficient checking.
Unfortunately, there are two obstacles to adding it to existing
compilers:

    Changes to the data structures (including argument passing) are
    needed. Not major ones, but enough to cause compatibility trouble.

    The domination of benchmarketing means that money spent on adding
    1% of performance is good, but money spent on adding reliability
    is bad.

However, please note that NAG Fortran 95 does indeed have such checking.
I don't know if any other Fortran 95 compilers do, but a lot of
Fortran 77 ones did.

Regards,
Nick Maclaren.
Received on Thu Sep 29 21:43:17 2005