Endianness of padded scalar objects [Visual C]

From: Ray on 24 Feb 2010 15:54

Hello,

Here is an issue I should know, but I just realized I'm not quite sure. Of
course, the endianness of a multibyte scalar object is defined by whether the
least significant byte occupies the lowest address (little endian) or the
highest address (big endian). (I'm not concerned about "middle endian" here.)
And of course, applying sizeof to any object produces the number of bytes of
storage used by that object.

For most scalar types on most implementations, all of those storage bytes are
used to represent the object's value. That is, a 4-byte int occupies exactly
4 bytes of storage, and an 8-byte double occupies 8 bytes of storage.
However, in some cases more storage is used for an object than is needed to
represent the object's value. For example, only 10 or 12 bytes may be needed
to represent the value of a long double, but on some implementations 6 or 4
additional bytes of padding may be used to enforce 16-byte memory alignment,
and when such a padded object is written to a file, all padding is included.

Assuming that my description is accurate, my concern is the appropriate way
to reverse the endianness of such an object. Obviously, for an object that
uses all of its storage to represent its value, reversing endianness simply
amounts to exchanging the lowest-addressed storage byte with the
highest-addressed storage byte and working your way toward the middle,
something like the code that follows. However, in the case of an object that
doesn't use all of its storage to represent its value, the padding byte(s)
will be swapped with some of the value bytes, which I believe is not what is
desired. Instead, the swap should begin with the highest-addressed byte that
is actually used for the object's value, ignoring the padding bytes entirely.
Am I correct in this assumption?

Assuming I am correct, I'm at a loss for a simple portable way to determine
how many bytes are used for the value and how many are used for padding. Of
course this can be determined by looking at the compiler's documentation, but
then portability goes out the window. I can also envision using some
bit-shifting scheme to determine this, but that would only work for objects
with integral types. What do you think?

Thanks,
Sonny

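#include <stddef.h>

/* Reverses the byte order of the size bytes starting at p, in place.
   size must be at least 1. */
void *ReverseEndian(void *p, size_t size)
{
    unsigned char *head = p;
    unsigned char *tail = head + size - 1;

    for (; tail > head; --tail, ++head)
    {
        unsigned char temp = *head;
        *head = *tail;
        *tail = temp;
    }
    return p;
}

int main(void)
{
    int x = 1;
    long double y = 1.0L;

    ReverseEndian(&x, sizeof(x));
    ReverseEndian(&y, sizeof(y));   /* also swaps any padding bytes */

    return 0;
}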
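
For comparison, here is a minimal sketch of the variant described above, in which only the value bytes are reversed and the padding is left alone. The function name ReverseValueBytes and the value_size parameter are hypothetical and not part of the original post. The sketch assumes the value bytes sit at the lowest addresses and the padding follows them, and it assumes the caller already knows value_size from the implementation's documentation, since the thread has not identified a portable way to obtain it.

```c
#include <assert.h>
#include <stddef.h>

/* Reverses only the first value_size bytes of an object that occupies
 * storage_size bytes. The remaining storage_size - value_size bytes are
 * treated as padding and are not touched.
 * ASSUMPTION: value bytes come first (lowest addresses), padding last.
 * ASSUMPTION: the caller supplies value_size; it is not computed here. */
void *ReverseValueBytes(void *p, size_t value_size, size_t storage_size)
{
    unsigned char *head = p;
    unsigned char *tail;

    assert(value_size <= storage_size);
    if (value_size < 2)
        return p;                       /* nothing to swap */

    for (tail = head + value_size - 1; tail > head; --tail, ++head)
    {
        unsigned char temp = *head;
        *head = *tail;
        *tail = temp;
    }
    return p;
}
```

For a 16-byte object with a 10-byte value, the call would be ReverseValueBytes(&y, 10, sizeof(y)). Whether the result is meaningful on the receiving machine still depends on where that machine expects its padding.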

From: David Lowndes on 24 Feb 2010 17:17

>For example, only 10 or 12 bytes may be needed
>to represent the value of a long double, but on some implementations 6 or 4
>additional bytes of padding may be used to enforce 16-byte memory alignment,
>and when such a padded object is written to a file, all padding is included.

I'm pretty sure that shouldn't be the case - do you have a specific
example where it is?

Dave

From: Igor Tandetnik on 24 Feb 2010 18:06

Ray <Ray(a)discussions.microsoft.com> wrote:
> Here is an issue I should know, but I just realized I'm not quite
> sure. Of course, the endianness of a multibyte scalar object is
> defined by whether the least significant byte occupies the lowest
> address (little endian) or the highest address (big endian), (I'm not
> concerned about "middle endian" here). And of course, taking the
> "sizeof" any object produces a count of the number of bytes of
> storage used by that object. For most scalar types on most
> implementations all of those storage bytes are used to actually
> represent the object's value. That is, a 4-byte int actually
> occupies exactly 4 bytes of storage, an 8-byte double actually
> occupies 8 bytes of storage. However, in some cases more storage is
> used for an object than is actually used to represent the object's
> value. For example, only 10 or 12 bytes may be needed to represent
> the value of a long double, but on some implementations 6 or 4
> additional bytes of padding may be used to enforce 16-byte memory
> alignment, and when such a padded object is written to a file, all
> padding is included. Assuming that my description is accurate, my
> concern is regarding the appropriate way to reverse the endian of
> such an object.

I don't know of any architecture where sizeof(long double) == 16, let alone two of them that differ in endianness. If you know of such machines, and you find yourself in an unenviable position of having to exchange binary data between them, you should consult their accompanying manuals, which hopefully would explain precisely how those long double values are laid out.

> Assuming I am correct, I'm at a loss for a simple portable way to
> determine how many bytes are used for the value and how many are used
> for padding.

There is no portable way to determine endianness of the machine to begin with, padding or no padding. Binary layout is necessarily machine-specific, and a program relying on any particular layout is non-portable. You keep talking about endianness: what about machines that use sign-magnitude or one's complement to represent signed integers (as opposed to two's complement used by most modern architectures)?
--
With best wishes,
Igor Tandetnik

With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea. It is hard to be sure where they are going to land, and it could be dangerous sitting under them as they fly overhead. -- RFC 1925
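
One common way to sidestep the layout question for unsigned integers is to never look at the in-memory bytes at all, and instead build the external byte sequence with shifts, along the lines of the bit-shifting scheme mentioned in the original question. The following is a minimal sketch under that assumption. The function names StoreU32BigEndian and LoadU32BigEndian are hypothetical, and the choice of most-significant-byte-first as the file order is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes v to out[0..3], most significant byte first. The result depends
 * only on the value of v, not on how the host stores it in memory. */
void StoreU32BigEndian(uint32_t v, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)((v >> 24) & 0xFFu);
    out[1] = (unsigned char)((v >> 16) & 0xFFu);
    out[2] = (unsigned char)((v >> 8) & 0xFFu);
    out[3] = (unsigned char)(v & 0xFFu);
}

/* Rebuilds the value from four bytes stored most significant byte first. */
uint32_t LoadU32BigEndian(const unsigned char in[4])
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) |
           (uint32_t)in[3];
}
```

This covers unsigned integers only. It does not address signed representations or floating-point types such as long double, which is where the padding question arises.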

From: Ray Mitchell on 24 Feb 2010 20:18

"David Lowndes" wrote:

> >For example, only 10 or 12 bytes may be needed
> >to represent the value of a long double, but on some implementations 6 or 4
> >additional bytes of padding may be used to enforce 16-byte memory alignment,
> >and when such a padded object is written to a file, all padding is included.
>
> I'm pretty sure that shouldn't be the case - do you have a specific
> example where it is?
>
> Dave
> .
>

Here is an example of gcc running on a Mac with an Intel/AMD processor.
Specifically, see the -m96bit-long-double and -m128bit-long-double options:

http://developer.apple.com/mac/library/DOCUMENTATION/DeveloperTools/gcc-4.0.1/gcc/i386-and-x86_002d64-Options.html

From: Igor Tandetnik on 24 Feb 2010 20:31

Ray Mitchell <RayMitchell_NOSPAM_(a)MeanOldTeacher.com> wrote:
> "David Lowndes" wrote:
>
>>> For example, only 10 or 12 bytes may be needed
>>> to represent the value of a long double, but on some
>>> implementations 6 or 4 additional bytes of padding may be used to
>>> enforce 16-byte memory alignment, and when such a padded object is
>>> written to a file, all padding is included.
>>
>> I'm pretty sure that shouldn't be the case - do you have a specific
>> example where it is?
>>
>> Dave
>> .
>>
>
> Here is an example of gcc running on a Mac with an Intel/AMD
> processor. Specifically, see the -m96bit-long-double and
> -m128bit-long-double options:
>
> http://developer.apple.com/mac/library/DOCUMENTATION/DeveloperTools/gcc-4.0.1/gcc/i386-and-x86_002d64-Options.html

Do these switches affect sizeof(long double), or just __alignof(long double) ?
--
With best wishes,
Igor Tandetnik
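
The size-versus-alignment question can be checked empirically on any given compiler and set of switches. The sketch below is one way to do that, not a definitive test. The helper names and the AlignProbe struct are hypothetical. The alignment is estimated from a struct member offset, which typically equals the alignment requirement but is not guaranteed to. LDBL_MANT_DIG from <float.h> gives the number of significand digits in base FLT_RADIX, which hints at how many of the storage bytes carry the value.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Probe struct: the offset of ld is normally the alignment of long double. */
struct AlignProbe { char c; long double ld; };

/* Storage bytes of a long double, padding included. */
size_t LongDoubleStorageBytes(void)
{
    return sizeof(long double);
}

/* Estimated alignment of long double, taken from the probe struct. */
size_t LongDoubleAlignment(void)
{
    size_t alignment = offsetof(struct AlignProbe, ld);

    /* An object's size is a multiple of its alignment. */
    assert(sizeof(long double) % alignment == 0);
    return alignment;
}

/* Significand digits of long double in base FLT_RADIX. */
int LongDoubleMantissaDigits(void)
{
    return LDBL_MANT_DIG;
}
```

Printing these three values with and without the switch in question shows directly whether sizeof changes. A storage size of 12 or 16 bytes combined with 64 significand digits would indicate an 80-bit value plus padding.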