From: MitchAlsup on
On Jun 25, 12:52 am, Terje Mathisen <"terje.mathisen at tmsw.no">
wrote:
> MitchAlsup wrote:
> > But back on topic: Can anyone show me a piece of code that depends on
> > integer overflow to create a buffer overrun? (Note: not integer
> > underflow)
>
> That seems almost trivial:
>
> An integer overflow turns a too-large positive result into a negative
> value instead, right?
>
> In that case buffer[a+b] turns out to really address buffer[-8] which is
> where the virtual method pointer (or some other interesting data item)
> happens to be located...
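Spelling that out as a sketch (the values are made up; it assumes 32-bit int and the
usual two's-complement wraparound, which the C standard doesn't actually promise for
signed arithmetic):

    /* sketch of the scenario: a and b are attacker-chosen "lengths" */
    void poke(char *buffer, int a, int b)
    {
        if (a + b < 100)           /* a=0x7fffffff, b=0x7ffffff9: the sum
                                      wraps to -8, so the check passes    */
            buffer[a + b] = 0;     /* writes buffer[-8]                   */
    }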

But why is this an integer overflow and not an integer underflow?

That is:: most arrays are from 0:max-1 or 1:max.
The most common form of buffer overrun is buffer[max+small_int]
The second most common form is buffer[-small_int]
Both of these are detected by bounds checking
Neither of these is detected by integer (32-bit) overflow or
underflow.
Nor is the second form detected by unsigned underflow {where
small_positive becomes small_negative}
Additionally, bounds checking is one of those algorithms that one can
teach the optimizer of a compiler to recognize, driving the overhead
down into the noise floor without losing any safety/robustness.
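For instance (a sketch, not anything from a particular compiler), the whole check
can be a single unsigned compare, which is the kind of thing an optimizer can hoist
out of a loop or strength-reduce:

    #include <stdlib.h>
    /* sketch: one unsigned compare catches both overrun forms --
       buffer[max+small_int] fails it directly, and buffer[-small_int]
       fails it because the negative index converts to a huge unsigned
       value */
    void checked_store(int *buffer, size_t max, long i, int value)
    {
        if ((size_t)i >= max)
            abort();               /* or whatever the fix-up policy is */
        buffer[i] = value;
    }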

Which leads me to a vastly better question:

Why do application programmers use int (integers) to index arrays?
Should not these actually be unsigned int? It seems to me that one
should have to justify using a signed value by the necessity of
needing a negative value at some time during its computation.

Mitch
From: Andy 'Krazy' Glew on
On 6/25/2010 12:44 AM, Terje Mathisen wrote:
> Andy 'Krazy' Glew wrote:
>> Overflow ::
>>
>> a >= 0 && b >= 0 ::
>>
>> a+b >= a ==> no overflow
>> a+b < a ==> overflow
>>
>> a < 0 && b >= 0 ::
>> a >= 0 && b < 0 ::
>> no overflow
>> (although you can have underflow,
>> negative overflow, which is handled
>> similarly)
>
> No, you cannot get that: Opposite signs are totally safe.

Sorry, nope. You're thinking in terms of C, specifically C signed.

Think about unsigned + signed.

Consider

uint8_t image_pixel = 1;
int8_t delta = -4;
image_pixel += delta;

It took me a while to realize this in MMX. Images are usually unsigned, and are often 8 bits. But image differences are
signed. They want to be extended precision, for example nine bits rather than eight bits, but for the usual reasons we
may want to do saturating arithmetic and stay in 8 bits (like, if we widen to 16 bits we cut performance in half).
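Roughly what the scalar version looks like (a sketch; the MMX code does this across
a whole register of pixels at once):

    #include <stdint.h>
    /* sketch: unsigned pixel plus signed delta with saturation -- do the
       add in a type wide enough for the "nine bit" intermediate, then
       clamp back into 0..255 */
    uint8_t add_sat_u8(uint8_t pixel, int8_t delta)
    {
        int wide = (int)pixel + (int)delta;   /* extended precision */
        if (wide < 0)   wide = 0;             /* negative overflow  */
        if (wide > 255) wide = 255;           /* positive overflow  */
        return (uint8_t)wide;
    }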

I was trying to come up with a macro that would handle this with the test
image_pixel >= 0 // by definition, because it is unsigned
&& delta < 0
==> possible negative overflow

without having to make code that was specific to the various combinations
template<T1,T2> bool overflow(v1,v2);
template<T1,T2> bool overflow<unsigned T1, unsigned T2>(v1,v2);
template<T1,T2> bool overflow<signed T1, signed T2>(v1,v2);
template<T1,T2> bool overflow<unsigned T1, signed T2>(v1,v2);
template<T1,T2> bool overflow<signed T1, unsigned T2>(v1,v2);

if for no other reason than that I can never remember the syntax for partial specialization.

And because if I want to do it in C rather than C++, I don't get to use partial specialization.
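For the C++ flavor, one dodge (a sketch, not the macro I was actually after) is to write
only the combination that matters here - unsigned value plus signed delta - as an ordinary
template, assuming the delta type is no wider than the value type:

    #include <limits>
    // sketch: unsigned value += signed delta; no partial specialization needed
    template <typename U, typename S>
    bool add_overflows(U value, S delta)
    {
        if (delta < 0) {
            // negative overflow when |delta| exceeds the unsigned value;
            // -(delta+1)+1 computes |delta| without negating S's minimum
            return static_cast<U>(-(delta + 1)) + 1 > value;
        } else {
            // positive overflow when delta exceeds the remaining headroom
            return static_cast<U>(delta) > std::numeric_limits<U>::max() - value;
        }
    }
    // usage: if (add_overflows(image_pixel, delta)) saturate; else image_pixel += delta;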



> sum = a+b;
> possible_overflow = (a ^ b) >= 0; // I.e. same sign
> overflow = possible_overflow &&
> ((sum ^ a) < 0); // Sign of sum has changed!
>
> Alternatively:
>
> sum = a+b;
> no_overflow = ((a ^ b) < 0) || ((sum ^ a) >= 0);
>
> or:
>
> sum = a+b;
> no_overflow = ((a ^ b) | (~sum ^ a)) < 0;


Okay, now how about the unsigned+unsigned and unsigned+signed cases?

u+u is easier, because C defines that to be modulo arithmetic.

u+s ... it seems that we have to have type specific code. Although I am tempted to do a dirty trick, since
unsigned+signed is signed, right?
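Something like the following, maybe (a sketch; the u+s version is that dirty trick - the
usual arithmetic conversions make the add itself happen in unsigned, which is well defined,
and then you check which way the result moved):

    /* u + u: C defines wraparound, so "did the sum wrap?" is the whole test */
    bool uu_overflows(unsigned a, unsigned b)
    {
        return a + b < a;
    }

    /* u + s: do the add in unsigned, then check the direction of the move */
    bool us_overflows(unsigned a, int b)
    {
        unsigned sum = a + (unsigned)b;   /* conversion is well defined */
        return b >= 0 ? sum < a           /* wrapped past the top       */
                      : sum > a;          /* wrapped below zero         */
    }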


> The next stage is to convert this flag into a properly saturated result:

I think I'll stop at a bool. Compilers already have a lot of tricks to smear or saturate. I just want an idiom for
overflow detection that a compiler can recognize and convert into more efficient code not expressible in C.

From: Andy 'Krazy' Glew on
On 6/24/2010 3:14 PM, Andrew Reilly wrote:
> On Thu, 24 Jun 2010 23:04:00 +0200, Terje Mathisen wrote:
>
>> Using an INTO handler avoids the JNO, making the macro shorter, but it
>> also increases the cost of handling overflows very significantly. :-(
>
> Particularly if your code wants to be able to do different fix-ups for
> overflows from different-sized arguments...


I see very little reason why, on a machine that I got to design rather than inherit, this would not have the cost of a
mispredicted branch plus a function call.

I envisage user level trap handlers as indirect calls to addresses in special registers.

Perhaps/perhaps not on the current stack. I prefer on the current stack - makes lightweight threading easy - can have
same handler for multiple threads.

Note: trap handlers. Not asynchronous interrupt handlers. The latter can arrive anywhere; traps arrive only at well
known places. Motivates an opcode bit to say "trap/don't trap".
From: mac on
>> Particularly if your code wants to be able to do different fix-ups
> > for
>> overflows from different-sized arguments...
>
> Ouch indeed:
>
> Reach back and disassemble the ADD/SUB/whatever instruction that
> generated the INTO, figure out the operand size and target register,
> and then fix it all: Neither easy nor fast.

Which points out another cost of CISCy instruction encodings. It's not
just the processor that has to parse them. It's any binutil.
Which might also include native code secure sandboxing.
--
mac the naïf
From: Andy 'Krazy' Glew on
On 6/25/2010 4:26 PM, MitchAlsup wrote:
> On Jun 25, 12:52 am, Terje Mathisen<"terje.mathisen at tmsw.no">

> But why is this an integer overflow and not an integer underflow?

I just got flamed (in email) for saying "integer undeflow" rather than "negative overflow". Some

>
> That is:: most arrays are from 0:max-1 or 1:max.
> The most common form of buffer overrun is buffer[max+small_int]
> The second most common form is buffer[-small_int]
> Both of these are detected by bounds checking
> Neither of these is detected by integer (32-bit) overflow or
> underflow.
> Nor is the second form detected by unsigned underflow {where
> small_positive becomes small_negative}
> Additionally, bounds checking is one of those algorithms that one can
> teach the optimizer of a compiler to recognize, driving the overhead
> down into the noise floor without losing any safety/robustness.
>
> Which leads me to a vastly better question:
>
> Why do application programmers use int (integers) to index arrays?
> Should not these actually be unsigned int? It seems to me that one
> should have to justify using a signed value by the necessity of
> needing a negative value at some time during its computation.


Actually, in the programming circles (newsgroups and wiki pages) that I lurk on - the sorts of places where people
debate Scott Meyers's "Effective C++: 55 Specific Ways to Improve Your Programs and Designs" - the trend is to say
"Never use unsigned - always use signed."

Although I can't remember which side of this argument Scott was on.

Overall, unsigned just seems to be a way to save a bit. Now that we are not short of bits, just make all integers
signed. It eliminates a whole class of bugs.
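The canonical member of that class, as a sketch (any backwards loop over a container):

    #include <cstdio>
    #include <vector>
    // v.size() is unsigned, so "i >= 0" is always true and the loop never
    // terminates (for an empty v it also starts at SIZE_MAX and indexes far
    // out of bounds); with a signed index, neither problem exists
    void print_backwards(const std::vector<int>& v)
    {
        for (size_t i = v.size() - 1; i >= 0; --i)   // BUG
            std::printf("%d\n", v[i]);
    }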