Endianness of padded scalar objects [Visual C]


From: Ray Mitchell on 24 Feb 2010 20:39

"Igor Tandetnik" wrote:

> Ray <Ray(a)discussions.microsoft.com> wrote:
> > Here is an issue I should know, but I just realized I'm not quite
> > sure. Of course, the endianness of a multibyte scalar object is
> > defined by whether the least significant byte occupies the lowest
> > address (little endian) or the highest address (big endian), (I'm not
> > concerned about "middle endian" here). And of course, taking the
> > "sizeof" any object produces a count of the number of bytes of
> > storage used by that object. For most scalar types on most
> > implementations all of those storage bytes are used to actually
> > represent the object's value. That is, a 4-byte int actually
> > occupies exactly 4 bytes of storage, an 8-byte double actually
> > occupies 8 bytes of storage. However, in some cases more storage is
> > used for an object than is actually used to represent the object's
> > value. For example, only 10 or 12 bytes may be needed to represent
> > the value of a long double, but on some implementations 6 or 4
> > additional bytes of padding may be used to enforce 16-byte memory
> > alignment, and when such a padded object is written to a file, all
> > padding is included. Assuming that my description is accurate, my
> > concern is regarding the appropriate way to reverse the endianness of
> > such an object.
>
> I don't know of any architecture where sizeof(long double) == 16,

How about this link - Check out the -m96bit-long-double and
-m128bit-long-double options:

http://developer.apple.com/mac/library/DOCUMENTATION/DeveloperTools/gcc-4.0.1/gcc/i386-and-x86_002d64-Options.html

> let alone two of them that differ in endianness.

I'm not clear on this part of your statement. Of course endianness will be
the same for all objects on a given type of processor, but can differ between
different types of processors.

> If you know of such machines, and you find yourself in an unenviable position of having to exchange binary data between them, you should consult their accompanying manuals, which hopefully would explain precisely how those long double values are laid out.
>
> > Assuming I am correct, I'm at a loss for a simple portable way to
> > determine how many bytes are used for the value and how many are used
> > for padding.
>
> There is no portable way to determine endianness of the machine to begin with, padding or no padding.

Assuming that at least some integral scalar types are more than one byte
(which I don't believe is actually required by the language standards), I
always thought the following type of thing could be used portably to
determine endianness if no padding is present in the object:

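#include <iostream>

// Reports byte order by storing 1 in a long and inspecting its bytes.
// Assumes sizeof(long) > 1 and no padding bits in long.
void DetermineEndian()
{
    union
    {
        long obj;
        char bytes[sizeof(obj)];
    } test = { 1 };

    if (test.bytes[0] == 1)
        std::cout << "Addressing is right-to-left (little endian)\n";
    // Index the last byte of the long, not of an int.
    else if (test.bytes[sizeof(test.bytes) - 1] == 1)
        std::cout << "Addressing is left-to-right (big endian)\n";
    else
        std::cout << "Addressing is strange (weird endian?)\n";
}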

> Binary layout is necessarily machine-specific, and a program relying on any particular layout is non-portable. You keep talking about endianness: what about machines that use sign-magnitude or one's complement to represent signed integers (as opposed to two's complement used by most modern architectures)?

Of course there are numerous portability considerations including those you
mention and several others. However, my concern here is only regarding
whether padding bytes in a scalar object should be involved in an endian byte
swap, which I believe they probably shouldn't.
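
A minimal sketch of that idea, under assumptions the thread does not settle: the value bytes are taken to occupy the lowest addresses of the object with all padding after them (as with the 10-byte x87 extended format that GCC pads to 12 or 16 bytes), and the number of value bytes must be supplied by the caller, since the language gives no portable way to query it. SwapValueBytes and its parameters are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Reverse only the bytes that hold the value, leaving trailing padding alone.
// ASSUMPTION: the value occupies the first valueBytes bytes of the object and
// the remaining (storageBytes - valueBytes) bytes are padding. The caller must
// know valueBytes from the implementation's documentation.
void SwapValueBytes(void *object, std::size_t valueBytes, std::size_t storageBytes)
{
    assert(object != 0);
    assert(valueBytes <= storageBytes);

    unsigned char *bytes = static_cast<unsigned char *>(object);
    std::reverse(bytes, bytes + valueBytes);   // padding bytes stay in place
}
```

For a 16-byte long double with 10 value bytes the call would be SwapValueBytes(&x, 10, sizeof x). Whether the padding sits before, after, or inside the value on the other machine still has to come from that machine's documentation.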

> --
> With best wishes,
> Igor Tandetnik
>
> With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea. It is hard to be sure where they are going to land, and it could be dangerous sitting under them as they fly overhead. -- RFC 1925

From: Barry Schwarz on 24 Feb 2010 21:28

On Wed, 24 Feb 2010 18:06:30 -0500, "Igor Tandetnik"
<itandetnik(a)mvps.org> wrote:

>I don't know of any architecture where sizeof(long double) == 16, let alone two of them that differ in endianness. If you know of such machines,
>and you find yourself in an unenviable position of having to exchange binary data between them, you should consult their accompanying manuals,
>which hopefully would explain precisely how those long double values are laid out.

Try the entire IBM zArchitecture family

--
Remove del for email

From: Igor Tandetnik on 24 Feb 2010 21:41

Ray Mitchell <RayMitchell_NOSPAM_(a)MeanOldTeacher.com> wrote:
> "Igor Tandetnik" wrote:
>> I don't know of any architecture where sizeof(long double) == 16,
>
> How about this link - Check out the -m96bit-long-double and
> -m128bit-long-double options:
>
> http://developer.apple.com/mac/library/DOCUMENTATION/DeveloperTools/gcc-4.0.1/gcc/i386-and-x86_002d64-Options.html

Do these switches affect sizeof(long double), or just __alignof(long double) ?
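
One way to check on a given compiler is simply to ask it. The sketch below collects sizeof, alignof, and LDBL_MANT_DIG for long double; LongDoubleLayout and DescribeLongDouble are hypothetical names, and alignof assumes a C++11 compiler. As far as the GCC documentation describes them, the two switches set the size of the type (12 or 16 bytes) rather than only its alignment, but the numbers from the actual build are what matter.

```cpp
#include <cassert>
#include <cfloat>
#include <cstddef>

// What the compiler reports for long double on this build.
struct LongDoubleLayout
{
    std::size_t storageBytes;   // sizeof(long double), padding included
    std::size_t alignment;      // alignof(long double)
    int mantissaDigits;         // LDBL_MANT_DIG, in base-FLT_RADIX digits
};

LongDoubleLayout DescribeLongDouble()
{
    LongDoubleLayout layout =
        { sizeof(long double), alignof(long double), LDBL_MANT_DIG };

    // The size of a type is always a multiple of its alignment; otherwise
    // array elements could not all be aligned.
    assert(layout.storageBytes % layout.alignment == 0);
    return layout;
}
```

If storageBytes * CHAR_BIT is clearly larger than the bits needed for the mantissa, exponent, and sign, the difference is padding. The macros do not say where in the object that padding lives.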

>> let alone two of them that differ in endianness.
>
> I'm not clear on this part of your statement. Of course endianness
> will be the same for all objects on a given type of processor, but
> can differ between different types of processors.

Do there exist two architectures that a) differ in endianness, and b) both have sizeof(long double) == 16 ?

If anything, I'd be more concerned about transferring data between two machines where sizeof(long double) itself is different, padding and endianness aside.
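
One common way around differing sizes, offered here only as a sketch and not as something proposed in this thread, is to exchange the values as decimal text instead of raw bytes. ToText and FromText are hypothetical helper names. Writing max_digits10 significant digits is enough for the value to be recovered exactly when it is read back into the same floating-point format; a receiver with a narrower long double will round it.

```cpp
#include <cassert>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Format a long double as decimal text with max_digits10 significant digits,
// so that reading it back into the same format recovers the value exactly.
std::string ToText(long double value)
{
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<long double>::max_digits10)
        << value;
    return out.str();
}

// Parse text produced by ToText back into a long double.
long double FromText(const std::string &text)
{
    std::istringstream in(text);
    long double value = 0.0L;
    in >> value;
    assert(!in.fail());   // the text must contain a parsable number
    return value;
}
```

The cost is larger files and slower I/O, but neither sizeof(long double), padding, nor byte order appears in the exchanged data.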

>> There is no portable way to determine endianness of the machine to
>> begin with, padding or no padding.
>
> Assuming that at least some integral scalar types are more than one
> byte (which I don't believe is actually required by the language
> standards), I always thought the following type of thing could be
> used portably to determine endianness if no padding is present in the
> object:
>
> void DetermineEndian()
> {
>     union
>     {
>         long obj;
>         char bytes[sizeof(obj)];
>     } test = { 1 };
>
>     if (test.bytes[0] == 1)

Assigning to one member of the union and then reading another exhibits undefined behavior.

> Of course there are numerous portability considerations including
> those you mention and several others. However, my concern here is
> only regarding whether padding bytes in a scalar object should be
> involved in an endian byte swap, which I believe they probably
> shouldn't.

If you can find two architectures that are actually affected by the problem, you can study their documentation and find out (at least for this one case).
--
With best wishes,
Igor Tandetnik


From: Igor Tandetnik on 24 Feb 2010 21:57

Barry Schwarz <schwarzb(a)dqel.com> wrote:
> On Wed, 24 Feb 2010 18:06:30 -0500, "Igor Tandetnik"
> <itandetnik(a)mvps.org> wrote:
>
>> I don't know of any architecture where sizeof(long double) == 16,
>> let alone two of them that differ in endianness. If you know of such
>> machines, and you find yourself in an unenviable position of having
>> to exchange binary data between them, you should consult their
>> accompanying manuals, which hopefully would explain precisely how
>> those long double values are laid out.
>
> Try the entire IBM zArchitecture family

Well, according to

http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/download/A2278325.pdf?DT=20070807125005&XKS=DZ9ZBK07

page 19-2, this architecture does indeed provide for 16-byte floating-point numbers, but they don't have any padding inside - all bytes are significant. Also, do z/Architecture machines come in both little-endian and big-endian flavors?
--
With best wishes,
Igor Tandetnik


From: Ray on 25 Feb 2010 01:57

"Igor Tandetnik" wrote:

> Do there exist two architectures that a) differ in endianness, and b) both have sizeof(long double) == 16 ?

I don't know, but my original question was intended to be theoretical.
Maybe there is no single answer, only an implementation-dependent one.

> If anything, I'd be more concerned about transferring data between two machines where sizeof(long double) itself is different, padding and endianness aside.

Yes, but my only concern at this point is regarding the endian swapping issue.

> > void DetermineEndian()
> > {
> >     union
> >     {
> >         long obj;
> >         char bytes[sizeof(obj)];
> >     } test = { 1 };
> >
> >     if (test.bytes[0] == 1)
>
> Assigning to one member of the union and then reading another exhibits undefined behavior.

Where did you get this information? Could you please refer me to the
appropriate section of the C standard that states this is the case? I
searched through the C99 standard and could find nothing that either directly
stated or implied this undefined behavior. Logically, to me at least, since
all union members start at the same address, examining the bytes of only the
most recently written member via a character pointer should yield perfectly
valid results, and that is what I am doing. And even if what you state is
true, I could simply set a separate character pointer equal to the address of
the entire union and examine the individual bytes that way, thereby not
reading through another member.
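
A sketch of that character-pointer variant, with no union involved. Inspecting an object's bytes through a pointer to unsigned char is permitted in both C and C++, so it sidesteps the union question. ByteOrder and DetermineByteOrder are hypothetical names, and the result is only meaningful under the same assumptions as before: unsigned long is wider than one byte and has no padding bits.

```cpp
#include <cassert>
#include <cstddef>

enum ByteOrder { LittleEndian, BigEndian, OtherEndian };

// Store 1 in an unsigned long and look at where the nonzero byte landed.
// ASSUMPTION: sizeof(unsigned long) > 1 and the type has no padding bits.
ByteOrder DetermineByteOrder()
{
    const unsigned long probe = 1UL;
    assert(sizeof probe > 1);

    // Reading any object's storage as unsigned char is allowed.
    const unsigned char *bytes =
        reinterpret_cast<const unsigned char *>(&probe);

    if (bytes[0] == 1)
        return LittleEndian;              // least significant byte first
    if (bytes[sizeof probe - 1] == 1)
        return BigEndian;                 // least significant byte last
    return OtherEndian;                   // neither end holds the 1
}
```

This reports the byte order of unsigned long only. It says nothing about where padding sits in a padded type such as long double.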
