From: already5chosen on 7 Apr 2008 14:58

On Apr 7, 8:25 pm, "Wilco Dijkstra" <Wilco_dot_Dijks...(a)ntlworld.com> wrote:
> "Duane Rettig" <du...(a)franz.com> wrote in message news:o0iqyttzes.fsf(a)gemini.franz.com...
> > "Wilco Dijkstra" <Wilco_dot_Dijks...(a)ntlworld.com> writes:
> >
> >> "Nick Maclaren" <n...(a)cus.cam.ac.uk> wrote in message news:ftd6np$fhn$1(a)gemini.csx.cam.ac.uk...
> >
> >>> In article <cNoKj.13308$h65.12...(a)newsfe2-gui.ntli.net>,
> >>> "Wilco Dijkstra" <Wilco_dot_Dijks...(a)ntlworld.com> writes:
> >>> |>
> >>> |> > "Adding a macro here and there" is precisely what I meant by doing it
> >>> |> > wrong. And, no, endianness does not change the sizes of anything - but
> >>> |> > genuinely portable code can handle size, endian and other differences.
> >>> |>
> >>> |> Changing endian does change structure size, this is exactly the kind of thing
> >>> |> that catches the less experienced people :-)
> >
> >>> Just exactly HOW does reordering bits change the space they take or
> >>> their alignment?
> >
> >> Combine non-pure endian types with bitfields. It should be easy to work out
> >> an example (remember you thought it was trivial?). Alignment never changes.
> >
> > I'd like to see this. So go ahead and work us out an example, since
> > it's so trivial. I think you're going to have a hard time, unless you
> > introduce other factors that were not directly related to endianness.
>
> No problem:
>
> struct X {
>     int x : 16;
>     long long y : 33;
>     long long z : 15;
>
> }
>
> Assuming 32-bit int, 64-bit long long, natural alignment and commonly used bitfield
> packing rules this structure takes 8 bytes normally. However it needs either 8 or 16
> bytes if long longs are mixed endian. A mixed endian type has its low and high parts
> always at the same offset, but the bytes in each part are swapped normally.
>

Doesn't the recommendation to avoid bit-fields if you are interested in a
portable data layout appear in every C book in existence?
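One common way to follow that recommendation is to pack the fields by hand with shifts and masks and to write the result out in a fixed byte order. The sketch below is only an illustration under stated assumptions: the 16/33/15-bit widths come from the struct quoted above, but the bit positions chosen for each field, the little-endian wire order, and the names `pack_fields`, `field_y`, `store_le64` and `load_le64` are hypothetical and do not come from any poster.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths taken from struct X in the thread; the bit positions
   below are an arbitrary choice made for this sketch. */
enum { X_BITS = 16, Y_BITS = 33, Z_BITS = 15 };

/* Pack three unsigned field values into one 64-bit word:
   x in bits 0-15, y in bits 16-48, z in bits 49-63. */
uint64_t pack_fields(uint64_t x, uint64_t y, uint64_t z)
{
    assert(x < (UINT64_C(1) << X_BITS));
    assert(y < (UINT64_C(1) << Y_BITS));
    assert(z < (UINT64_C(1) << Z_BITS));
    return x | (y << X_BITS) | (z << (X_BITS + Y_BITS));
}

/* Extract the 33-bit y field again. */
uint64_t field_y(uint64_t packed)
{
    return (packed >> X_BITS) & ((UINT64_C(1) << Y_BITS) - 1);
}

/* Store a 64-bit value as 8 octets, least significant octet first.
   The result does not depend on the byte order of the host. */
void store_le64(unsigned char out[8], uint64_t v)
{
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)((v >> (8 * i)) & 0xFF);
}

/* Inverse of store_le64. */
uint64_t load_le64(const unsigned char in[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)in[i] << (8 * i);
    return v;
}
```

Because the layout is fixed by arithmetic on `uint64_t` values rather than by the compiler's bit-field rules, the stored form always occupies exactly 8 octets, whatever the compiler or the endianness of the machine.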
From: Duane Rettig on 7 Apr 2008 19:31

"Wilco Dijkstra" <Wilco_dot_Dijkstra(a)ntlworld.com> writes:

> "Duane Rettig" <duane(a)franz.com> wrote in message news:o0iqyttzes.fsf(a)gemini.franz.com...
>> "Wilco Dijkstra" <Wilco_dot_Dijkstra(a)ntlworld.com> writes:
>>
>>> "Nick Maclaren" <nmm1(a)cus.cam.ac.uk> wrote in message news:ftd6np$fhn$1(a)gemini.csx.cam.ac.uk...
>>>>
>>>> In article <cNoKj.13308$h65.12966(a)newsfe2-gui.ntli.net>,
>>>> "Wilco Dijkstra" <Wilco_dot_Dijkstra(a)ntlworld.com> writes:
>>>> |>
>>>> |> > "Adding a macro here and there" is precisely what I meant by doing it
>>>> |> > wrong. And, no, endianness does not change the sizes of anything - but
>>>> |> > genuinely portable code can handle size, endian and other differences.
>>>> |>
>>>> |> Changing endian does change structure size, this is exactly the kind of thing
>>>> |> that catches the less experienced people :-)
>>>>
>>>> Just exactly HOW does reordering bits change the space they take or
>>>> their alignment?
>>>
>>> Combine non-pure endian types with bitfields. It should be easy to work out
>>> an example (remember you thought it was trivial?). Alignment never changes.
>>
>> I'd like to see this. So go ahead and work us out an example, since
>> it's so trivial. I think you're going to have a hard time, unless you
>> introduce other factors that were not directly related to endianness.
>
> No problem:
>
> struct X {
>     int x : 16;
>     long long y : 33;
>     long long z : 15;
> }

Ah, so you're giving me non-portable code to prove non-portability. I
see.

Anyone can indeed write non-portable code, in any language. That
point is not at issue.

>> Be careful about using absolutes. Not _everybody_ uses fixed-width
>> types. My product, for example, which uses some C code to interface
>> to system libraries, and which has a "foreign function" interface to
>> enable users to load and call libraries, and we strive always to move
>> away from any reference to size. Instead, we match the machine model
>> (e.g. ILP32, LP64, IL32P64, etc) with the names of types, and get
>> ourselves away from sizes in type names.
>
> I'm not sure what you mean by matching, but if you assume a particular model
> then you are effectively working with sized types, whatever name you use.

Well of course every type has a size. That's also not at issue. The
question is if the size matters, and whether the abstraction at the
source level can be made portable. For example, we use a typedef
called "nat", which is meant to be the natural word size of the
program being run. We might have used "long" to represent this type,
except that Windows XP-64 uses an IL32P64 model, which means that
longs are still 32-bit. But by defining the nat type, we can say what
we mean by (an integer that has the same size as a pointer) and we can
avoid code that is non-portable and full of conditionalizations. In
that sense, we do not use a fixed-width type; we use a type that
describes the problem space we're working on in an abstract manner,
without it being necessary to reveal its size.

>>> Another well known example is trying to represent
>>> N languages and M target architectures using a single intermediate language.
>>> Unfortunately it doesn't work like that in the real world...
>>
>> We've had experience with this, and it's not the case of intermediate
>> languages being inherently problematic - it's more the case where the
>> intermediate language is simply not powerful enough to take on some of
>> the desired target languages in an efficient manner.
>
> So you add some more specific intermediate instructions and semantics. Do this
> for several languages and targets, and you either end up with a huge intermediate
> language

but "huge" is so relative...

>> or complex intermediate instructions whose semantics varies with the context.

or if you spend the time to refactor the intermediate language, you
start refining it to the point where it is able to take on more
targets with less added complexity.

> The current compiler I'm working on supports 3 different exception models in the
> intermediate language, and that's just 3 languages so far... To make matters worse,
> each target encodes exception tables different enough that very little can be shared.

Perhaps you're just not abstracting your intermediate model deeply
enough.

--
Duane Rettig    duane(a)franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182
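The "nat" idea (an integer with the same size as a pointer) can be sketched in C roughly as follows. The post does not show how the type is actually declared, so building it on `intptr_t` is an assumption made here for illustration, as are the helper names `nat_from_pointer`, `pointer_from_nat` and `check_nat_roundtrip`. Note that `intptr_t` is optional in C99, and the standard guarantees only that a `void *` survives the round trip, not that the sizes match, which is why the sketch checks the size at compile time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical declaration: the post names the type but does not
   show how it is defined. */
typedef intptr_t nat;

/* C99-compatible compile-time check: the array size becomes -1, and
   compilation fails, if nat and a pointer differ in size. */
typedef char nat_matches_pointer[(sizeof(nat) == sizeof(void *)) ? 1 : -1];

/* Convert between pointers and nat without naming a bit width. */
nat nat_from_pointer(void *p)
{
    return (nat)p;
}

void *pointer_from_nat(nat n)
{
    return (void *)n;
}

/* Run-time sanity check that a pointer survives the round trip. */
void check_nat_roundtrip(void *p)
{
    assert(pointer_from_nat(nat_from_pointer(p)) == p);
}
```

The same source compiles unchanged under ILP32, LP64 and IL32P64 models wherever `intptr_t` exists, which is the point being made: the type name states the intent, and the size follows from the platform.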
From: Paul Gotch on 7 Apr 2008 20:58

Duane Rettig <duane(a)franz.com> wrote:
> > struct X {
> >     int x : 16;
> >     long long y : 33;
> >     long long z : 15;
> > }

> Ah, so you're giving me non-portable code to prove non-portability. I
> see.

That code is standard C99; however, how it is represented differs
depending on how the compiler chooses to implement bitfields. Just as a C++
compiler can choose the length of enums to fit the declared values within
them.

One way of taking a buffer of bytes and interpreting them as a protocol is
to cast the pointer to a packed struct. However, as soon as you do this (in
fact as soon as you use a compiler extension such as a packed struct) you
are making assumptions about the compiler implementation and the underlying
machine. This is often worth it if it saves you hundreds or thousands of
instructions' worth of byte accesses, compares and branches.

Unfortunately, when dealing with something where the typing is poorly
defined at compile time, somewhere along the line you end up casting it
through a void* pointer and making assumptions about the memory layout. Now
this may be hidden inside an XDR library, or an ASN.1 library etc.; however,
it's still there.

-p
--
"Unix is user friendly, it's just picky about who its friends are."
 - Anonymous
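For contrast with the packed-struct cast, the byte-access alternative mentioned above might look like the following sketch. The wire format (a 16-bit type followed by a 32-bit length, both big-endian) and the names `struct header`, `HEADER_WIRE_SIZE` and `parse_header` are invented for this example and are not taken from any real protocol or library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire format: 16-bit type, then 32-bit length,
   both most significant octet first. */
struct header {
    uint16_t type;
    uint32_t length;
};

enum { HEADER_WIRE_SIZE = 6 };

/* Decode a header octet by octet. Returns 0 on success and -1 if
   the buffer is too short. No assumption is made about struct
   padding, alignment of buf, or the byte order of the host. */
int parse_header(const unsigned char *buf, size_t len, struct header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < HEADER_WIRE_SIZE)
        return -1;

    out->type = (uint16_t)(((unsigned)buf[0] << 8) | buf[1]);
    out->length = ((uint32_t)buf[2] << 24)
                | ((uint32_t)buf[3] << 16)
                | ((uint32_t)buf[4] << 8)
                |  (uint32_t)buf[5];
    return 0;
}
```

This costs a handful of shifts and ORs per field, which is the overhead the packed-struct cast avoids, in exchange for not depending on a compiler extension or on the machine's memory layout.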
From: Nick Maclaren on 8 Apr 2008 03:58

In article <nlm*odR-r(a)news.chiark.greenend.org.uk>,
Paul Gotch <paulg(a)at-cantab-dot.net> writes:
|> Duane Rettig <duane(a)franz.com> wrote:
|> > > struct X {
|> > >     int x : 16;
|> > >     long long y : 33;
|> > >     long long z : 15;
|> > > }
|>
|> > Ah, so you're giving me non-portable code to prove non-portability. I
|> > see.
|>
|> That code is standard C99. however how it's represented is different
|> depending on how the compiler chooses to implement bitfields. Just as a C++
|> compiler can choose the length of enums to fit the declared values with in
|> them.

I said that I was going to drop out, but you are a new poster on this
thread, and I can't let that pass :-(

No, it isn't, not even remotely. Even in its syntax alone, there are
two cases, where its very validity depends on implementation-defined
features (see 6.7.2 #5, 6.7.2.1 #4, #9 and others). In one case, the
above code could quietly give wrong answers if its assumption was
false - in the other, a compiler message is more-or-less required.

Duane's remarks about its representation considerably understate the
case - there was considerable debate on the SC22WG14 reflector about
bit-fields, and the decision (I won't say consensus) was to specify
their syntax and leave almost all other aspects implementation-defined
or unspecified.

In particular, the standard is VERY clear that the order of storing
bits in bit-fields is completely unspecified - there is nothing
stopping a compiler from using a completely different convention for
'int' bit-fields and plain 'int'. Someone stated that in so many
words, too.

Incidentally, as I read the standard, it is unspecified - not even
implementation-defined - whether the bits in the 'int' and first
'long long' will share a storage unit.

There's more, but I should need to explain the interaction with other
parts of C99 to explain the situation, so I shall stop.
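Since so much is left to the implementation, the practical way to learn what a given compiler does with the disputed struct is to ask it. The following is a minimal probe, assuming a compiler that accepts `long long` bit-fields at all (GCC and Clang do by default, though they may warn in pedantic mode); the function names `size_of_X` and `first_nonzero_byte_for_x` are made up for this sketch, and the values it reports describe only the compiler and target it was built with.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* The struct from the thread, with the missing semicolon added.
   Whether 'long long' is permitted as a bit-field type is
   implementation-defined. */
struct X {
    int x : 16;
    long long y : 33;
    long long z : 15;
};

/* Size chosen by this compiler. The three fields total 64 bits, so
   the struct can never be smaller than that. */
size_t size_of_X(void)
{
    assert(sizeof(struct X) * CHAR_BIT >= 64);
    return sizeof(struct X);
}

/* Set only x to 1 in an otherwise zeroed object and return the index
   of the first non-zero byte of its representation, or -1 if none.
   The index depends on the bit-field ordering and byte order that
   this implementation happens to use. */
int first_nonzero_byte_for_x(void)
{
    struct X s;
    unsigned char raw[sizeof(struct X)];

    memset(&s, 0, sizeof s);
    s.x = 1;
    memcpy(raw, &s, sizeof s);

    for (size_t i = 0; i < sizeof raw; i++)
        if (raw[i] != 0)
            return (int)i;
    return -1;
}
```

Printing both results under different compilers and targets is one way to see the variation under discussion; no particular output is promised by the standard.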
From: Wilco Dijkstra on 8 Apr 2008 06:38

"Duane Rettig" <duane(a)franz.com> wrote in message news:o0bq4ltf71.fsf(a)gemini.franz.com...
> "Wilco Dijkstra" <Wilco_dot_Dijkstra(a)ntlworld.com> writes:
>
>> "Duane Rettig" <duane(a)franz.com> wrote in message news:o0iqyttzes.fsf(a)gemini.franz.com...
>>> "Wilco Dijkstra" <Wilco_dot_Dijkstra(a)ntlworld.com> writes:

>> struct X {
>>     int x : 16;
>>     long long y : 33;
>>     long long z : 15;
>> }
>
> Ah, so you're giving me non-portable code to prove non-portability. I
> see.

That shows how little you know about writing portable code and the C standard...
This code is 100% portable to any compiler that implements the C99 standard.
In fact even most non-C99 compilers can compile this code flawlessly - long long
was a de facto standard well before C99.

> Well of course every type has a size. That's also not at issue. The
> question is if the size matters, and whether the abstraction at the
> source level can be made portable. For example, we use a typedef
> called "nat", which is meant to be the natural word size of the
> program being run.

OK, so you do exactly the same as everybody else: using typedefs for types
rather than using the standard ones. So I think we all agree the C builtin types
are useless and non-portable. The models you mentioned have the same sizes
for most types apart from pointers (and sometimes long), so it's hard to argue
they are evil. You may not rely on their exact width but you still require them
to be a specific minimum width.

>> So you add some more specific intermediate instructions and semantics. Do this
>> for several languages and targets, and you either end up with a huge intermediate
>> language
>
> but "huge" is so relative...
>
>>> or complex intermediate instructions whose semantics varies with the context.
>
> or if you spend the time to refactor the intermediate language, you
> start refining it to the point where it is able to take on more
> targets with less added complexity.

It's a nice ideal indeed, but has anyone ever succeeded in doing it? You can
spend a lot of time on refactoring, but there will always be languages that still
don't fit in...

>> The current compiler I'm working on supports 3 different exception models in the
>> intermediate language, and that's just 3 languages so far... To make matters worse,
>> each target encodes exception tables different enough that very little can be shared.
>
> Perhaps you're just not abstracting your intermediate model deeply enough.

It has been abstracted as much as feasible. The problem is the semantics of
various exception models are so different you can't find a common high-level
abstraction that supports all. Of course at the lowest level you can represent
exceptions as special control flow edges in a CFG, but at that point you've lost
most high-level info - and thus the ability to optimise.

Wilco