From: Wes on
I've been playing around with hpgcc and after wondering why some of my
unions were not behaving as I expected, I noticed something
interesting about how doubles are stored in memory on the ARM. This
may be common knowledge, but it was new to me.

As one would expect on a Little Endian machine, the 64 bit unsigned
long long
0x 12345678 9ABCDEF0 is stored in memory as: f0 de bc 9a 78 56 34 12

However, the 64 bit double 1.24 whose IEEE754 representation is
0x 3FF3D70A 3D70A3D7 is stored in memory as: 0a d7 f3 3f d7 a3 70 3d

The 32 bit words are in Big Endian order, but the bytes within the
words are in Little Endian order.
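
A minimal C sketch of how a byte dump like the ones above can be produced. The helper names hex_bytes and dump_examples are made up for illustration and are not part of hpgcc. The sketch assumes an 8-byte double and an 8-byte unsigned long long. The output depends on the target: an ordinary little-endian PC stores the double fully little-endian, so only a target that uses the word order described in this post will show the mixed layout.

```c
#include <assert.h>
#include <stdio.h>

/* Format the n bytes at p, in memory order, as "xx xx ..." into out.
   out must have room for at least 3 * n chars. Returns out. */
static char *hex_bytes(const void *p, size_t n, char *out)
{
    const unsigned char *b = (const unsigned char *)p;
    char *w = out;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        w += sprintf(w, i ? " %02x" : "%02x", (unsigned)b[i]);
    return out;
}

/* Print the memory image of the two example values from the post. */
static void dump_examples(void)
{
    unsigned long long u = 0x123456789ABCDEF0ULL;
    double d = 1.24;
    char buf[3 * sizeof u];

    assert(sizeof u == 8 && sizeof d == 8);
    printf("unsigned long long: %s\n", hex_bytes(&u, sizeof u, buf));
    printf("double 1.24:        %s\n", hex_bytes(&d, sizeof d, buf));
}
```

Calling dump_examples() from main() prints one line per value. Reading the object through an unsigned char pointer avoids the union entirely.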

I see http://en.wikipedia.org/wiki/Endianness says:
"On some machines, while integers were represented in little-endian
form, floating-point numbers were represented in big-endian form."

But this is neither one. I understand the reasons behind Big or Little
Endian, but what's the reasoning behind a mixed endian format? And
why on one data type but not another? Is this common on other CPUs?

As I was looking up information on this I came across this interesting
article:
http://www.linuxdevices.com/articles/AT5920399313.html

-wes
From: Claudio Lapilli on
Hi,

On Jan 13, 2:03 pm, Wes <wjltemp...(a)yahoo.com> wrote:
> I've been playing around with hpgcc and after wondering why some of my
> unions were not behaving as I expected, I noticed something
> interesting about how doubles are stored in memory on the ARM.  This
> may be common knowledge, but it was new to me.

It's not so common. I don't think a lot of people are aware of this,
and it's very strange, but that's how the gcc compiler for ARM treats
doubles: two 32-bit words stored in big-endian order. I ran across
this issue when trying to implement direct conversion of 'doubles' to
calculator reals.

>
> As one would expect on a Little Endian machine, the 64 bit unsigned
> long long
> 0x 12345678 9ABCDEF0 is stored in memory as: f0 de bc 9a  78 56 34 12
>
> However, the 64 bit double 1.24 whose IEEE754 representation is
> 0x 3FF3D70A 3D70A3D7 is stored in memory as: 0a d7 f3 3f  d7 a3 70 3d
>
> The 32 bit words are in Big Endian order, but the bytes within the
> words are in Little Endian order.
>
> I see http://en.wikipedia.org/wiki/Endianness says:
> "On some machines, while integers were represented in little-endian
> form, floating-point numbers were represented in big-endian form."
>
> But this is neither one. I understand the reasons behind Big or Little
> Endian, but what's the reasoning behind a mixed endian format?  And
> why on one data type but not another?  Is this common on other CPUs?

What happens is that the ARM processor can be set in either big endian
or little endian mode. The compiler's developers apparently chose the
big endian format for doubles regardless of the processor endianness.
But the "store" operation happens in (2) 32-bit words. The compiler
stores first the hi word and then the lo word (big endian *ALWAYS*),
but if the processor is in little endian mode (which is true in the
calculator), each word is internally stored in little endian mode. If
the ARM processor is set in big endian mode, you would have a "true"
big-endian representation.
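
The two-step store described above can be modeled in portable C. This is only a sketch of the layout, not the code gcc actually emits. The names store_word_le and store_double_mixed are hypothetical, and the function takes the IEEE 754 bit pattern as an integer so that it behaves the same on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Write one 32-bit word to out[0..3], least significant byte first,
   the way a little-endian CPU stores a word. */
static void store_word_le(uint32_t w, unsigned char *out)
{
    out[0] = (unsigned char)(w & 0xff);
    out[1] = (unsigned char)((w >> 8) & 0xff);
    out[2] = (unsigned char)((w >> 16) & 0xff);
    out[3] = (unsigned char)((w >> 24) & 0xff);
}

/* Model of the mixed layout: the high word goes to the lower address,
   then the low word, and each word is itself stored little-endian.
   bits is the IEEE 754 bit pattern of the double. */
static void store_double_mixed(uint64_t bits, unsigned char out[8])
{
    assert(out != 0);
    store_word_le((uint32_t)(bits >> 32), out);              /* hi word */
    store_word_le((uint32_t)(bits & 0xffffffffu), out + 4);  /* lo word */
}
```

Feeding it 0x3FF3D70A3D70A3D7 (the pattern for 1.24) gives the bytes 0a d7 f3 3f d7 a3 70 3d, which matches the dump quoted above.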

This is a very weird behavior that should have been avoided (maybe a
compiler switch to use either fully little endian or fully big-endian
but not this mix). As you already discovered, when you use unions to
"break" your doubles into bytes you'll have to account for that or
you'll get a headache.
And if you intend to use the same code on a PC and the calculator,
then you'd better #define some macros to reverse the words
automatically.
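
One possible shape for such a macro, as a sketch only. The names SWAP_WORDS64, double_words_swapped and double_to_bits are invented for this example. It assumes an 8-byte IEEE 754 double and detects the word order at run time by looking at how 1.0 is stored, instead of relying on a compiler-specific predefined macro.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Exchange the two 32-bit halves of a 64-bit value.
   Note: the argument is evaluated twice. */
#define SWAP_WORDS64(x) (((uint64_t)(x) << 32) | ((uint64_t)(x) >> 32))

/* Returns 1 if this target stores the two words of a double in the
   opposite order from the two words of a 64-bit integer, else 0. */
static int double_words_swapped(void)
{
    double one = 1.0;
    uint64_t raw;

    assert(sizeof one == sizeof raw);
    memcpy(&raw, &one, sizeof raw);
    /* IEEE 754 1.0 is 0x3FF0000000000000. With the halves exchanged,
       the same bytes read back as 0x000000003FF00000. */
    return raw == 0x000000003FF00000ULL;
}

/* Return the IEEE 754 bit pattern of d regardless of word order. */
static uint64_t double_to_bits(double d)
{
    uint64_t raw;

    memcpy(&raw, &d, sizeof raw);
    return double_words_swapped() ? SWAP_WORDS64(raw) : raw;
}
```

With this, double_to_bits(1.24) should yield 0x3FF3D70A3D70A3D7 on both the PC and the calculator, so the code that picks apart sign, exponent and mantissa can be shared. Using memcpy instead of a union also sidesteps the type-punning question.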


Claudio
From: Wes on
On Jan 13, 10:31 pm, Claudio Lapilli <pleasedonts...(a)isp.com> wrote:
> It's not so common. I don't think a lot of people are aware of this,
> and it's very strange, but that's how the gcc compiler for ARM treats
> doubles: two 32-bit words stored in big-endian order. I ran across
> this issue when trying to implement direct conversion of 'doubles' to
> calculator reals.

That's just what I was experimenting with.

> If the ARM processor is set in big endian mode,
> you would have a "true" big-endian representation.

I noticed this by compiling with -mbig-endian -save-temps and looking
at the assembly .s file.

> This is a very weird behavior that should have been avoided (maybe a
> compiler switch to use either fully little endian or fully big-endian
> but not this mix).

Sounds to me like it was more of an ARM issue than a gcc issue. If I
understand correctly, the optional ARM floating point hardware had
this mixed-endian behavior. The Linux gcc ARM compiler produces code
that uses this hardware if available and emulation if it is not, so
that the same executable can run on both machines. This requires gcc
to follow ARM's word order convention. Although I suppose if you knew
it was always going to be emulated, you could do whatever you wanted
with the word order.

It reminds me of the 8088/286/386 days when the 8087/287/387
coprocessors were optional. The MS-C compiler had options to
generate code which either:
a) required the coprocessor
b) used the coprocessor if present and emulated it if not present
c) used a faster non-coprocessor-compatible software code

(I remember the day I put a 287 coprocessor in my 286-10MHz screamer
-- fractals had met their match.)

-wes