From: Terje Mathisen "terje.mathisen at tmsw.no" on
Andrew Reilly wrote:
> On Wed, 23 Jun 2010 10:28:56 +0200, Terje Mathisen wrote:
>
>> Thomas Womack wrote:
>>> In article
>>> <a6d4ec20-9052-4003-a3c7-486885d791a4(a)q12g2000yqj.googlegroups.com>,
>>> MitchAlsup<MitchAlsup(a)aol.com> wrote:
>>>> # define sat_add(a,b) ((tmp = (a)+(b)), (tmp > SAT_MAX ? SAT_MAX :
>>>> (tmp < SAT_MIN ? SAT_MIN : tmp)))
>>>
>>> And what type is 'tmp'?
>>
>> Any signed type with at least one more bit of precision than a or b? :-)
>
> I think that I mentioned using extended precision in my first post. Not
> always possible or useful. Not all C compilers for 32-bit processors can

I was responding to the other part of the question really, i.e. I
suggested that the proper way to do it on HW which doesn't support
saturation is to work as much as possible in high enough precision to
avoid the need to check for overflow, then do it once at the end while
storing the final data back.
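
Something like this minimal sketch (the function name and the int32_t
working type are my assumptions, not anything from the thread):

#include <stddef.h>
#include <stdint.h>

/* Work in 32-bit precision throughout, then clamp once while storing
   the final 16-bit results back. */
static void store_saturated(int16_t *dst, const int32_t *wide, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int32_t v = wide[i];
        dst[i] = v > INT16_MAX ? INT16_MAX
               : v < INT16_MIN ? INT16_MIN
               : (int16_t)v;
    }
}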

> do useful "long long", although most can, and the number that can't is
> decreasing. This doesn't help if a and b are already the largest type
> available. Not all of the processors that I (help to) support are 32-
> bit, either: some 16, some 24, some 64, some word-addressed.

Those tend to be DSPs, right?

They very often have stuff like a 40-bit accumulator for 16-bit data:
the 8 guard bits are sufficient to accumulate ~256 16x16-bit
multiplication results.
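
For instance, a portable stand-in for that pattern might look like this
sketch (names are mine; int64_t plays the 40-bit accumulator):

#include <stddef.h>
#include <stdint.h>

/* A 16x16 product needs at most 31 bits of magnitude, so the 8 guard
   bits of a 40-bit accumulator cover 2^8 = 256 of them. */
static int64_t mac16(const int16_t *a, const int16_t *b, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)a[i] * b[i];   /* force a 32-bit product */
    return acc;
}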
>
> Why is it so impossible to imagine that signed integer arithmetic might
> overflow? Especially in languages that don't have lisp-style graceful
> bignum-fallback, which is just about everything. That's weird 1984-style
> nu-think.

I agree totally.

I must admit though that I have pretty much given up on (portable) C
support for signed ints: I try to do as much work as possible using
unsigned, and will even use macros to generate sign-extended data. :-(
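
For instance, one such macro might look like this sketch (the name and
width are my choices; the final conversion back to signed assumes the
usual two's complement mapping):

#include <stdint.h>

/* Sign-extend the low BITS bits of x using only unsigned arithmetic,
   for 1 <= BITS <= 31: mask to the field, flip the sign bit, then
   subtract it back out. */
#define SIGN_EXTEND(x, BITS) \
    ((int32_t)((((uint32_t)(x) & ((1u << (BITS)) - 1u)) \
                ^ (1u << ((BITS) - 1))) - (1u << ((BITS) - 1))))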

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
From: Andy 'Krazy' Glew on
On 6/22/2010 3:41 PM, Andrew Reilly wrote:

> I admit that the overflow() idea was daft, but defining operations for
> signed arithmetic that actually captures what the hardware does doesn't
> seem unreasonable, or too inimical to optimisation. You certainly
> haven't made a convincing argument along those lines.
>
> Saying "signed integer overflow is just wrong" is unhelpful: it happens,
> it's what the hardware does, and some reasonable algorithms want to know
> about it when it happens.


I like overflow(x+y), as in

uint8_t x,y, sum;
if( overflow(x+y) ) sum = MAXINT; else sum = x+y;

Of course, it doesn't fit with the language. But maybe we can tweak it.

How about overflow<uint8_t>(x+y) ?

Close, but in the normal run of things this would evaluate (x+y) in uint8_t, and then do the overflow test - too late.

What we really want is something like

overflow<uint8_t,uint16_t>(x+y)

or

overflow<uint8_t>( static_cast<uint16_t>(x) + static_cast<uint16_t>(y) )

which, I suppose, you can already do yourself - but certainly is painful.
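
One way to package that today, as a sketch (only the wide-evaluation
trick is from the thread; the rest is my choice of spelling):

#include <cstdint>
#include <limits>

// Test whether a value computed in a wide type fits the narrow one.
template <typename Narrow, typename Wide>
bool overflow(Wide v)
{
    return v < static_cast<Wide>(std::numeric_limits<Narrow>::min())
        || v > static_cast<Wide>(std::numeric_limits<Narrow>::max());
}

// usage, following the uint8_t example above:
//   auto wide = static_cast<uint16_t>(x) + static_cast<uint16_t>(y);
//   sum = overflow<uint8_t>(wide) ? std::numeric_limits<uint8_t>::max()
//                                 : static_cast<uint8_t>(wide);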
From: Andy 'Krazy' Glew on
On 6/23/2010 12:15 AM, nmm1(a)cam.ac.uk wrote:
> The former, no, but you are wrong with the latter. The point is that
> you can't do any significant code rearrangement if you want to either
> 'capture what the hardware does' or produce deterministic results.
> That shows up much more clearly with IEEE 754, but the same applies
> to integers once you do anything non-trivial or have (say) a two's
> complement model with an overflow flag (i.e. like IEEE 754).

2's complement WITHOUT an overflow flag CAN be rearranged significantly.

This is a "Just think about it" issue.

(I think I recently posted about optimizations for machines that did not do sign extension on 8, 16, or 32 bit loads.)
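
A minimal illustration (unsigned stands in for flagless two's
complement, since C makes signed overflow undefined):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Wraparound addition is associative mod 2^32, so both groupings give
   the same answer even though (a + b) wraps; a compiler may therefore
   reassociate freely when there is no flag or trap to preserve. */
int main(void)
{
    uint32_t a = 0x80000000u, b = 0x80000001u, c = 5u;
    printf("%" PRIu32 "\n", (a + b) + c);   /* prints 6 */
    printf("%" PRIu32 "\n", a + (b + c));   /* prints 6 */
    return 0;
}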
From: Andy 'Krazy' Glew on
On 6/22/2010 11:03 AM, George Neuner wrote:
> On Tue, 22 Jun 2010 07:35:30 -0700, Andy 'Krazy' Glew
> <ag-news(a)patten-glew.net> wrote:
>
>> More commonly: load with sign extension is usually slower than
>> loading without sign extension [*], since in normal
>> representation it involves waiting for the MSB and smearing it
>> over many upper bits. So many new instruction proposals have
>> proposed doing away with signed loads.
>
> I guess the question is: is a sign-extended load faster than code that
> zeros the register, performs a short load, tests the high bit of the
> value and possibly ORs the value with the (2's complement) sign mask?
>
> Depending on the ISA that's 4-7 instructions vs 1.

The usual proposal has been to provide a zero-extended load, and then a separate register to register instruction to
sign extend from the 8th, 16th, or 32nd bit. Like x86 MOVSX reg,reg. 2 instructions.

The other usual proposal has been to do a zero-extended load, shift left (e.g. by 24), and then do a sign-extending
arithmetic shift right. 3 instructions. However, depending on your machine, shifts may be a single cycle, or they may
be 4-7 cycles (try that on Willamette).
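
In C, the same idiom looks like this sketch (it assumes the usual
arithmetic right shift on signed types, which is implementation-defined
but near-universal):

#include <stdint.h>

/* z holds a byte that was loaded zero-extended: shift it to the top of
   the register, then arithmetic-shift back down to smear the sign bit. */
static inline int32_t sext8(uint32_t z)
{
    return (int32_t)(z << 24) >> 24;
}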

Your 4-7 instruction code alternative seems almost gratuitously slow.

From: Andy 'Krazy' Glew on
On 6/22/2010 9:24 AM, MitchAlsup wrote:

> But I think the actual problem is not overflow and underflow per se,
> but that the (access) computations are not bounds checked. Although
> there certainly are situations where overflow/underflow cause
> problems, overflow and underflow are a subset of the bounds checking
> that (some argue) should be taking place in all/most accesses. Some
> like overflow and underflow detection because it's cheap--so cheap it
> has passed out of being available in any consistent form. Bounds
> checking was unavoidable in the 'descriptor' machines of the later
> 1960s.

I spent 2005-2009 working on bounds checking. 2006-2008 almost full time. (And I have spent much time before that,
although I think that you had left Gould, Mitch, before the "5th generation" work of the late 1980s.)

I agree that bounds checking (buffer overflow) is the most important problem (that hardware has much chance of helping
with [*]).

But integer overflow comes up right behind it. Sure, usually the problem is that integer overflow leads to a buffer
overflow (the sketch at the end of this post shows the classic shape). However, the integer overflow that causes the
buffer overflow may have occurred a long time ago. Programmers usually like to find bugs as early as possible, rather
than much later, when the cause can no longer be traced. Also,
buffer overflows caused by integer overflows are not the only source of security flaws. Occasionally there are security
flaws caused by integer overflows that do not cause buffer overflows, such as

a) indexing, within bounds, into a string - producing a mucked up string. Like, giving you somebody else's login account.

b) if statements: if( overflowing_integer_value < threshold ) do_something_that_requires_security.
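
The classic shape of the overflow-then-buffer-overflow chain, as a
hedged sketch (the function and sizes are illustrative, not from any
particular incident):

#include <stdint.h>
#include <stdlib.h>

int *alloc_table(size_t n)
{
    /* With a 32-bit size_t, n = 0x40000001 makes n * sizeof(int) wrap
       to 4: malloc succeeds with a tiny buffer, and every later
       "in-bounds" index up to n is a heap overflow, detected (if ever)
       far from the multiplication that caused it. */
    if (n > SIZE_MAX / sizeof(int))   /* the check that is usually missing */
        return NULL;
    return malloc(n * sizeof(int));
}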