RISC load-store versus x86 Add from memory. [Computer Architecture]


From: EricP on 24 Jun 2010 02:18

Terje Mathisen wrote:
> Thomas Womack wrote:
>> In article <a6d4ec20-9052-4003-a3c7-486885d791a4(a)q12g2000yqj.googlegroups.com>,
>> MitchAlsup <MitchAlsup(a)aol.com> wrote:
>>> #define sat_add(a,b) ((tmp = (a)+(b)), (tmp > SAT_MAX ? SAT_MAX : (tmp < SAT_MIN ? SAT_MIN : tmp)))
>>
>> And what type is 'tmp'?
>
> Any signed type with at least one more bit of precision than a or b?
> :-)
>
> Terje
>

Ok, a signed (2's complement) overflow detect is:
sum = a + b;
overflow = ((a^sum)&(b^sum)) < 0;

so....

#define sat_add(a,b) ((sum=(a)+(b)), (((a)^sum)&((b)^sum)) < 0 ? \
(sum < 0 ? SAT_MAX : SAT_MIN) : sum)

A good compiler would use CMOVs, so there would be no branches.
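
For reference, the macro can also be written as a function. This is only a sketch, assuming 32-bit two's complement operands; the name sat_add32 is hypothetical. The sum is formed in unsigned arithmetic so that the wrap is well defined in C instead of relying on signed overflow.

```c
#include <stdint.h>

/* Hypothetical helper: saturating add for 32-bit two's complement values.
   The wrapped sum is computed in uint32_t, where wraparound is defined. */
static int32_t sat_add32(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t usum = ua + ub;                 /* wraps modulo 2^32 */

    /* Overflow iff both operands differ in sign from the sum, i.e. the
       sign bit of (a^sum)&(b^sum) is set. */
    if (((ua ^ usum) & (ub ^ usum)) >> 31) {
        /* A wrapped sum with the sign bit set means positive overflow. */
        return (usum >> 31) ? INT32_MAX : INT32_MIN;
    }
    return a + b;                            /* safe: no overflow here */
}
```

For example, sat_add32(INT32_MAX, 1) gives INT32_MAX and sat_add32(INT32_MIN, -1) gives INT32_MIN. Whether a compiler turns the selects into CMOVs depends on the compiler and target.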

Eric

From: Andrew Reilly on 24 Jun 2010 04:48

On Wed, 23 Jun 2010 14:45:44 +0200, Terje Mathisen wrote:
> Andrew Reilly wrote:
>> Not all of the processors that I (help to) support are 32-bit,
>> either: some 16, some 24, some 64, some word-addressed.
>
> Those tend to be DSPs, right?

Mostly, yes. Modern DSPs actually tend to have pretty good C compilers
(and several of them are byte-addressable, just to make that easier.)
Certainly good enough to get started, anyway.

> They very often have stuff like a 40-bit accumulator for 16-bit data:
> Sufficient to accumulate ~256 operations on 16x16 bit multiplication
> results.

Yes. Interestingly, even the ones with 40-bit accumulators tend to be
quite well supported by C these days.
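
The ~256 figure comes from the guard bits: treating each 16x16 product as a full 32-bit quantity leaves 40 - 32 = 8 bits of headroom, and 2^8 = 256. A rough sketch, assuming the 40-bit accumulator is modelled in an int64_t; the names are hypothetical and not any particular DSP's API.

```c
#include <stdint.h>

#define ACC_BITS 40
#define ACC_MAX  ((INT64_C(1) << (ACC_BITS - 1)) - 1)   /* 2^39 - 1 */
#define ACC_MIN  (-(INT64_C(1) << (ACC_BITS - 1)))      /* -2^39 */

/* Hypothetical model: multiply-accumulate n pairs of 16-bit samples and
   report whether the running sum stayed inside the 40-bit signed range.
   Returns 1 if it always fit, 0 as soon as it does not. */
static int mac_fits_40(const int16_t *x, const int16_t *y, int n)
{
    int64_t acc = 0;
    for (int i = 0; i < n; i++) {
        acc += (int32_t)x[i] * (int32_t)y[i];   /* product fits in 32 bits */
        if (acc > ACC_MAX || acc < ACC_MIN)
            return 0;
    }
    return 1;
}
```

With signed data the largest product is (-32768) * (-32768) = 2^30, so the 256-operation rule of thumb is on the conservative side.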

Cheers,

--
Andrew

From: Andrew Reilly on 24 Jun 2010 04:53

On Wed, 23 Jun 2010 08:06:38 -0700, Andy 'Krazy' Glew wrote:

> overflow<uint8_t>( static_cast<uint16_t>(x) + static_cast<uint16_t>(y)
> )
>
> which, I suppose, you can already do yourself - but certainly is
> painful.

I can see where you're going here, and the equivalent as a cpp macro in C
would be almost as ugly. I don't do C++, as a rule. I won't insert my
usual C++ rant here, but I'm sure you've heard it before. The target
support issue is still the big one for me, luckily ;-)

Cheers,

--
Andrew

From: Andy 'Krazy' Glew on 24 Jun 2010 08:42

On 6/23/2010 11:16 PM, Mike Hore wrote:
> Andy 'Krazy' Glew wrote:

> Even assuming you fix the expression so it works, would it still survive
> an optimization that takes "undefined on overflow" to mean "assume
> overflow can't happen" - as we were discussing earlier?

No.

It would just be a long and complicated expression that, potentially, a compiler could recognize as an idiom for
overflow detection, on a machine with weighted binary unsigned (and, hopefully, signed) representation.

And, one would hope, the compiler might choose to use any better overflow detection mechanism available on the
hardware.
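
One way to hand that choice to the compiler directly is an overflow-checking builtin. The sketch below assumes GCC or Clang, both of which provide __builtin_add_overflow (in releases later than this thread); it is not standard C, and the wrapper name add_overflows_u8 is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wrapper around the GCC/Clang builtin (an assumption about
   the toolchain, not portable C). Returns true when x + y does not fit in
   a uint8_t; *sum always receives the wrapped result. */
static bool add_overflows_u8(uint8_t x, uint8_t y, uint8_t *sum)
{
    return __builtin_add_overflow(x, y, sum);
}
```

The builtin leaves it to the compiler to pick whatever flag or instruction the target offers, rather than hoping a hand-written idiom is recognized.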

From: Andy 'Krazy' Glew on 24 Jun 2010 09:00

On 6/23/2010 7:45 PM, Andy 'Krazy' Glew wrote:
> On 6/23/2010 11:29 AM, Terje Mathisen wrote:
>> Andy 'Krazy' Glew wrote:
>>> Whereas, if you use the normal behaviour of 2's complement integers
>>> (signed - what does it mean to say that a 2's complement number is
>>> unsigned?)
>>>
>>> #define sat_add(a,b)
>>> ((typeof<a>(a+b)>(a))&&(typeof<a>(a+b)>(b))?(a+b):SAT_MAX)
>>>
>>> works for all 2's complement types. signed. and, yes, unsigned.
>>
>> Huh???
>>
>> What happens when both a and b are negative?
>>
>> (-1 + -2) is less than both -1 and -2, so both parts of that test will
>> agree that the proper answer is SAT_MAX: Probably not what you want!
>>
>> The next issue is of course when you do (-100 + -100) with 8-bit values
>> and end up with +56 instead of -200 or a saturated -128.
>>
>> Terje
>>
>
>
> I know. The expression got too long for me to write in the margins of
> comp.arch.

I was half expecting somebody to take up the gauntlet, and start a little contest for the shortest expression in
not-really-portable C assuming conventional non-redundant weighted binary unsigned and 2's complement signed arithmetic,
with wrap - and, in the above example, saturate, although in other examples maybe do something else.

(Of course, saturation is just one of several actions you can take on overflow detection.)

Case analysis: a >= 0, b >= 0, etc.

Overflow ::

a >= 0 && b >= 0 ::
    a+b >= a ==> no overflow
    a+b < a ==> overflow

a < 0 && b >= 0 ::
a >= 0 && b < 0 ::
    no overflow
    (operands of opposite sign cannot
    overflow in either direction)

a < 0 && b < 0 ::
    no overflow
    (although you can have underflow,
    negative overflow, which is handled
    similarly)
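
The case analysis above can be sketched in C roughly as follows. This assumes 8-bit two's complement operands, to match the (-100 + -100) example; the name sat_add8 is hypothetical. The wrapped sum is formed in uint8_t, where wraparound is well defined.

```c
#include <stdint.h>

/* Hypothetical helper: 8-bit saturating add following the case analysis. */
static int8_t sat_add8(int8_t a, int8_t b)
{
    /* Wrapped sum, modulo 2^8. */
    uint8_t wrapped = (uint8_t)((uint8_t)a + (uint8_t)b);
    /* Reinterpret the wrapped bits as a signed value without relying on
       implementation-defined narrowing. */
    int sum = wrapped < 128 ? (int)wrapped : (int)wrapped - 256;

    if (a >= 0 && b >= 0)
        return sum < a ? INT8_MAX : (int8_t)sum;   /* wrapped below a: overflow */
    if (a < 0 && b < 0)
        return sum >= a ? INT8_MIN : (int8_t)sum;  /* wrapped above a: underflow */
    return (int8_t)sum;                            /* mixed signs: no overflow */
}
```

With this sketch, sat_add8(-100, -100) saturates to -128 instead of wrapping to +56, and sat_add8(-1, -2) still gives -3.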
