From: Nick Maclaren on

In article <xmqKj.44591$5i5.32691(a)newsfe6-gui.ntli.net>,
"Wilco Dijkstra" <Wilco_dot_Dijkstra(a)ntlworld.com> writes:
|>
|> > Just exactly HOW does reordering bits change the space they take or
|> > their alignment?
|>
|> Combine non-pure endian types with bitfields. It should be easy to work out
|> an example (remember you thought it was trivial?). Alignment never changes.

Eh? I think that you are seriously confused.

|> There isn't any stdint.h controversy. Everybody used fixed-width types like
|> int32 long before stdint.h and continues to do so. It's basic software engineering
|> as you say. The only things that sometimes cause discussion are the redundant
|> fast_xxx and least_xxx types, but few use them anyway.

You clearly weren't involved in the standardisation of C99. I was.

Basic software engineering is NOT to use those abominations, where
portability between systems or to future versions of the same system
is even a minor objective. Clean code runs without changes and without
disgusting preprocessor grobble on 32- and 64-bit systems (and would
do so on 16-, 24-, 60- and other-bit systems).


Anyway, this may be about language architecture, but I am sure it is
boring most people. I won't continue.


Regards,
Nick Maclaren.
From: Duane Rettig on
"Wilco Dijkstra" <Wilco_dot_Dijkstra(a)ntlworld.com> writes:

> "Nick Maclaren" <nmm1(a)cus.cam.ac.uk> wrote in message news:ftd6np$fhn$1(a)gemini.csx.cam.ac.uk...
>>
>> In article <cNoKj.13308$h65.12966(a)newsfe2-gui.ntli.net>,
>> "Wilco Dijkstra" <Wilco_dot_Dijkstra(a)ntlworld.com> writes:
>> |>
>> |> > "Adding a macro here and there" is precisely what I meant by doing it
>> |> > wrong. And, no, endianness does not change the sizes of anything - but
>> |> > genuinely portable code can handle size, endian and other differences.
>> |>
>> |> Changing endian does change structure size, this is exactly the kind of thing
>> |> that catches the less experienced people :-)
>>
>> Just exactly HOW does reordering bits change the space they take or
>> their alignment?
>
> Combine non-pure endian types with bitfields. It should be easy to work out
> an example (remember you thought it was trivial?). Alignment never changes.

I'd like to see this. So go ahead and work us out an example, since
it's so trivial. I think you're going to have a hard time, unless you
introduce other factors that were not directly related to endianness.

>> No, you don't - standard software engineering (including from long before
>> that term was invented). This is not the group to explain why, but think
>> of validity checking, tracing and so on. And my remark referred to the
>> stdint.h controversy, which you may or may not know about.
>
> There isn't any stdint.h controversy. Everybody used fixed-width types like
> int32 long before stdint.h and continues to do so. It's basic software engineering
> as you say. The only things that sometimes cause discussion are the redundant
> fast_xxx and least_xxx types, but few use them anyway.

Be careful about using absolutes. Not _everybody_ uses fixed-width
types. My product, for example, uses some C code to interface
to system libraries and has a "foreign function" interface to
enable users to load and call libraries, and we strive always to move
away from any reference to size. Instead, we match the machine model
(e.g. ILP32, LP64, IL32P64, etc) with the names of types, and get
ourselves away from sizes in type names. It makes for extremely
portable code; we've ported to over 25 operating systems over at least
10 architectures over the past 20 years, both endiannesses, 32 and
64-bits, and the C code doesn't change due to sizes/endiannesses (only
due to enhancements and bug fixes).

>> |> So explain how you would test multiple orthogonal choices that have complex
>> |> interactions with each other without testing most of the cross product.
>>
>> See above. You can simplify an NxN problem by converting it into a
>> Nx1 and a 1xN problem.
>
> That is only possible if the choices are truly independent of each other - which
> they aren't in this case. Another well known example is trying to represent
> N languages and M target architectures using a single intermediate language.
> Unfortunately it doesn't work like that in the real world...

We've had experience with this, and it's not the case of intermediate
languages being inherently problematic - it's more the case where the
intermediate language is simply not powerful enough to take on some of
the desired target languages in an efficient manner.

--
Duane Rettig duane(a)franz.com Franz Inc. http://www.franz.com/
555 12th St., Suite 1450 http://www.555citycenter.com/
Oakland, Ca. 94607 Phone: (510) 452-2000; Fax: (510) 452-0182
From: Nick Maclaren on

In article <o0iqyttzes.fsf(a)gemini.franz.com>,
Duane Rettig <duane(a)franz.com> writes:
|> "Wilco Dijkstra" <Wilco_dot_Dijkstra(a)ntlworld.com> writes:
|>
|> >> Just exactly HOW does reordering bits change the space they take or
|> >> their alignment?
|> >
|> > Combine non-pure endian types with bitfields. It should be easy to work out
|> > an example (remember you thought it was trivial?). Alignment never changes.
|>
|> I'd like to see this. So go ahead and work us out an example, since
|> it's so trivial. I think you're going to have a hard time, unless you
|> introduce other factors that were not directly related to endianness.

We would also need to know the language - C90, C99 and C++ are all
slightly different in this area.

|> Instead, we match the machine model
|> (e.g. ILP32, LP64, IL32P64, etc) with the names of types, and get
|> ourselves away from sizes in type names. It makes for extremely
|> portable code; we've ported to over 25 operating systems over at least
|> 10 architectures over the past 20 years, both endiannesses, 32 and
|> 64-bits, and the C code doesn't change due to sizes/endiannesses (only
|> due to enhancements and bug fixes).

Yes. And that has been the solution since time immemorial, including
with floating-point precision in Fortran from the late 1960s onwards.

|> We've had experience with this, and it's not the case of intermediate
|> languages being inherently problematic - it's more the case where the
|> intermediate language is simply not powerful enough to take on some of
|> the desired target languages in an efficient manner.

Or where the target systems are so mind-bogglingly perverse that they
are impossible to map onto a clean model :-( I don't know of any like
that which are in current use, though there may be some.


Regards,
Nick Maclaren.
From: Wilco Dijkstra on

"Duane Rettig" <duane(a)franz.com> wrote in message news:o0iqyttzes.fsf(a)gemini.franz.com...
> "Wilco Dijkstra" <Wilco_dot_Dijkstra(a)ntlworld.com> writes:
>
>> "Nick Maclaren" <nmm1(a)cus.cam.ac.uk> wrote in message news:ftd6np$fhn$1(a)gemini.csx.cam.ac.uk...
>>>
>>> In article <cNoKj.13308$h65.12966(a)newsfe2-gui.ntli.net>,
>>> "Wilco Dijkstra" <Wilco_dot_Dijkstra(a)ntlworld.com> writes:
>>> |>
>>> |> > "Adding a macro here and there" is precisely what I meant by doing it
>>> |> > wrong. And, no, endianness does not change the sizes of anything - but
>>> |> > genuinely portable code can handle size, endian and other differences.
>>> |>
>>> |> Changing endian does change structure size, this is exactly the kind of thing
>>> |> that catches the less experienced people :-)
>>>
>>> Just exactly HOW does reordering bits change the space they take or
>>> their alignment?
>>
>> Combine non-pure endian types with bitfields. It should be easy to work out
>> an example (remember you thought it was trivial?). Alignment never changes.
>
> I'd like to see this. So go ahead and work us out an example, since
> it's so trivial. I think you're going to have a hard time, unless you
> introduce other factors that were not directly related to endianness.

No problem:

struct X {
    int x : 16;
    long long y : 33;
    long long z : 15;
};

Assuming 32-bit int, 64-bit long long, natural alignment and commonly used bitfield
packing rules this structure takes 8 bytes normally. However it needs either 8 or 16
bytes if long longs are mixed endian. A mixed endian type has its low and high parts
always at the same offset, but the bytes in each part are swapped normally.

In the following, let's assume a mixed endian 64-bit type always starts with the low
32-bit word at offset 0 and the high 32 bits at offset 4 (i.e. in little endian the words
are in their natural order):

In little endian we get an 8-byte structure as follows: x goes in byte 0 and 1, y goes in
bytes 2,3,4,5 and one bit in byte 6. z uses the remaining bits of byte 6 and byte 7.

Allocation of bitfields is reversed in big-endian, so x uses the top bits of the 32-bit
container, which is byte 0 and 1. We try to allocate 33 bits in a 64-bit bitfield next.
Allocation starts in the top bits, which is byte 4 of the mixed endian type. We can
allocate 32 bits and then wrap around into the low word (bytes 0..3). The problem
however is we've already used bytes 0 and 1 of the 64-bit container, so the biggest
bitfield that could fit is 32 bits (we can't put the last bit in byte 3 as that would split
the 33-bit field into two parts). Therefore a new 8-byte container is allocated at
bytes 8-15. y is then allocated at bytes 12-15 plus one bit in byte 8. z goes into the
rest of byte 8 and byte 9, so the structure takes 16 bytes.

> Be careful about using absolutes. Not _everybody_ uses fixed-width
> types. My product, for example, uses some C code to interface
> to system libraries and has a "foreign function" interface to
> enable users to load and call libraries, and we strive always to move
> away from any reference to size. Instead, we match the machine model
> (e.g. ILP32, LP64, IL32P64, etc) with the names of types, and get
> ourselves away from sizes in type names.

I'm not sure what you mean by matching, but if you assume a particular model
then you are effectively working with sized types, whatever name you use.

>> Another well known example is trying to represent
>> N languages and M target architectures using a single intermediate language.
>> Unfortunately it doesn't work like that in the real world...
>
> We've had experience with this, and it's not the case of intermediate
> languages being inherently problematic - it's more the case where the
> intermediate language is simply not powerful enough to take on some of
> the desired target languages in an efficient manner.

So you add some more specific intermediate instructions and semantics. Do this
for several languages and targets, and you either end up with a huge intermediate
language or complex intermediate instructions whose semantics vary with the context.

The current compiler I'm working on supports 3 different exception models in the
intermediate language, and that's just 3 languages so far... To make matters worse,
each target encodes exception tables differently enough that very little can be shared.

Wilco


From: Wilco Dijkstra on

"Nick Maclaren" <nmm1(a)cus.cam.ac.uk> wrote in message news:ftddoc$4fr$1(a)gemini.csx.cam.ac.uk...
>
> In article <xmqKj.44591$5i5.32691(a)newsfe6-gui.ntli.net>,
> "Wilco Dijkstra" <Wilco_dot_Dijkstra(a)ntlworld.com> writes:
> |>
> |> > Just exactly HOW does reordering bits change the space they take or
> |> > their alignment?
> |>
> |> Combine non-pure endian types with bitfields. It should be easy to work out
> |> an example (remember you thought it was trivial?). Alignment never changes.
>
> Eh? I think that you are seriously confused.

See my other post for a simple example.

> |> There isn't any stdint.h controversy. Everybody used fixed-width types like
> |> int32 long before stdint.h and continues to do so. It's basic software engineering
> |> as you say. The only things that sometimes cause discussion are the redundant
> |> fast_xxx and least_xxx types, but few use them anyway.
>
> You clearly weren't involved in the standardisation of C99. I was.
>
> Basic software engineering is NOT to use those abominations, where

Ah, maybe you were the one causing the controversy? :-)

> portability between systems or to future versions of the same system
> is even a minor objective. Clean code runs without changes and without
> disgusting preprocessor grobble on 32- and 64-bit systems (and would
> do so on 16-, 24-, 60- and other-bit systems).

You got it the wrong way around. Using the standard types makes porting more
difficult. It's not impossible to write portable code that way of course, but using
correctly sized types makes porting a lot easier - just change a single typedef.

Wilco