From: robertwessel2 on 5 Dec 2008 19:24

On Dec 5, 6:00 pm, NathanCBa...(a)gmail.com wrote:
> On Dec 5, 6:39 pm, "Alexei A. Frounze" <alexfrun...(a)gmail.com> wrote:
>
> > Or one could implement C in such a way that chars and ints are of the
> > machine word size (>= 16 bits). That way the pointers to int and char
> > don't have to be of different size. Example: the compiler for TI's
> > TMS320C54xx series.
>
> Do you happen to know if the booleans were implemented as packed or
> unpacked?

The TI DSP C compilers implement C89 (and not C++ or C99), and so don't define a boolean type. If you meant bit fields, they do pack (as size allows), into the 16 bit ints.
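As a rough illustration of the bit-field packing mentioned above, here is a minimal C89-style sketch. The struct and function names are hypothetical, and nothing here is taken from the TI compiler: ordering, padding and the size of the storage unit are all implementation-defined, so only the value behaviour of the fields is checked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status word: three flags declared as bit fields.
   How they are laid out inside an int is up to the implementation. */
struct status {
    unsigned ready : 1;
    unsigned mode  : 3;
    unsigned count : 4;
};

/* Store a value in the 3-bit field and read it back.
   Unsigned bit fields wrap modulo 2**width, so only the low 3 bits survive. */
unsigned mode_round_trip(unsigned value)
{
    struct status s = { 0, 0, 0 };
    s.mode = value;
    return s.mode;
}

/* Size of the whole struct in C bytes. Often one unsigned int,
   but the standard does not promise that. */
size_t status_size(void)
{
    return sizeof(struct status);
}

/* Self-check of the portable part only. */
void status_self_check(void)
{
    assert(mode_round_trip(5u) == 5u);
    assert(mode_round_trip(9u) == 1u);   /* 9 mod 8 */
}
```

Calling `status_size()` on a given compiler shows whether the eight declared bits were packed into a single storage unit there.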
From: Alexei A. Frounze on 5 Dec 2008 20:39

On Dec 6, 3:00 am, NathanCBa...(a)gmail.com wrote:
> On Dec 5, 6:39 pm, "Alexei A. Frounze" <alexfrun...(a)gmail.com> wrote:
>
> > Or one could implement C in such a way that chars and ints are of the
> > machine word size (>= 16 bits). That way the pointers to int and char
> > don't have to be of different size. Example: the compiler for TI's
> > TMS320C54xx series.
>
> Do you happen to know if the booleans were implemented as packed or
> unpacked?

I have no idea, never used them in C, since they appeared somewhat late in the game and were easy to implement in a variety of ways. I'm not even sure if the compiler supported them.

Alex
From: Rod Pemberton on 6 Dec 2008 04:36 <robertwessel2(a)yahoo.com> wrote in message news:3361d292-73b2-4dd0-99a5-2bbc7d74dbd7(a)x38g2000yqj.googlegroups.com... On Dec 5, 6:29 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote: > <robertwess...(a)yahoo.com> wrote in message > > news:00b99e16-304e-4ecc-a6e2-d193025dd4de(a)q9g2000yqc.googlegroups.com... > On Dec 4, 4:00 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote: > I decided to reorganize the conversation slightly so that I could make my points clearer. > Quoting from the C99 standard: > > "3.6: byte: addressable unit of data storage large enough to hold any > member of the basic character set of the execution environment. > Previously: RP> > > > IIRC, the C > > > > standard requires a "C char" must be large enough to represent the > > > > entire > > > > character set. It must be at least the size of the minimum addressable Ok, I admit I switched around a "C char" and "C byte".... I got it correct in one of our past discussions. Like I said, "from memory"... > "3.7.1: character - single-byte character <C> bit representation that > fits in a byte" Previously: RW> > > A C char and byte are the same size, > > > RP> > False. It seems you're still wrong here. 3.7.1's "fits in" is clearly different than "same size". "fits in" clearly indicates one can be larger than the other. Given > "3.6: byte: addressable unit of data storage large enough to hold any > member of the basic character set of the execution environment. together with > "3.7.1: character - single-byte character <C> bit representation that > fits in a byte" Previously: RW> > > There is no smaller unit of addressable storage in C than > > > a char, and chars must be at least 8 bits, > > > RP> > True. Oh my! It seems we're *BOTH* wrong here. (Where's Phil when you need him?) Those two C99 quotes clearly indicate that a "byte" is the smallest addressable unit of storage in C, not a char... 
> IOW, as far as C is concerned, the thing called a byte and a char are > essentially the same thing. False. You just quoted C99 above! It said a char must "fit in" a byte. I.e., a byte can be larger than a char. It said the byte is the smallest addressable unit from C's perspective. I admit I got them reversed, but you didn't grasp what you quoted! > That is reinforced in a number of other places in either version of > the standard. For example in 6.2.6.1 (C99): > > "Values stored in non-bit-field objects of any other object type > consist of n * CHAR_BIT bits, where n is the size of an object of that > type, in bytes. The value may be copied into an object of type > unsigned char [n] (e.g., by memcpy); the resulting set of bytes is > called the object representation of the value." > > Which clearly requires the equivalence of bytes and chars. As I see it, no. They're only partially equivalent in one direction. I.e., bytes must be greater or equal to chars in size. If a byte is 9-bits, a char can be 8-bits. I.e., a char is not equivalent to a byte. But, a byte can represent a char and then some. > It clearly > says that N bytes can be stored in N (unsigned) chars. Reversed? I think that says N (whatever) chars fit in N bytes. Doesn't it? I.e., if bytes are 9-bits, if chars are 8-bits, if CHAR_BIT is 8, if the object is 5 bytes in size, then the object is n*CHAR_BIT bits or 5*8-bits. And, the object "fits in" n*bytes or 5*9-bits. I.e., the upper bit of the byte is "dead space" or there is 1-bit of padding between chars. > 5.2.4.2.1 of C99 ("Sizes of integer type" ) says in the definition of > CHAR_BIT, "number of bits for smallest object that is not a bit-field > (byte)". And further specified that CHAR_BIT be at least 8. > > Footnote 40 of 6.2.6.1 (C99): "A byte contains CHAR_BIT bits." Which > happens to be exactly the same number of bit a char contains. > > There are numerous other such statements. I haven't looked at those. 
But, I'd think most of these are likely "incorrect" from the abstraction of C from mostly 8-bit architectures that was done for C89. Or, it's "understood" to currently be clarified by 3.6 and 3.7.1. > There is no smaller unit of addressability in C than a char. Which is > the same as a byte. False. You just quoted C99 above! It said a char must fit in a byte. I.e., a byte can be larger than a char. It said the byte is the smallest addressable unit from C's perspective. I admit I got them reversed, but you didn't grasp what you quoted! > Your statement "if the smallest native addressable unit is 4-bits, > that's a C byte. And a C char must be at least 8-bits, therefore it's > at least two C bytes." is flatly wrong. There is not, without a non- > standard extension, any addressability to anything smaller than a > char. And C bytes may not be 4 bits. There is no type "byte" in C, > it exists in the C mostly to distinguish the notion of the physically > stored data in memory from the logical type char. You're correct. This is all backwards. Think about it... > That hardware bytes (for lack of a better term for the smallest > addressable unit of storage) are commonly 8 bits these days is wholly > irrelevant. Hardware bytes, whatever those may be, are *not* > addressed by the C standard. They are partially addressed by the C standard. What do you think "addressable unit of data storage" really refers to? It refers to the fact that C's byte, the smallest addressable unit of storage, must map onto the hardware's addressable unit or units. > If whatever the hardware likes to treats > as the minimal addressable unit does not meet the C (and > implementation) requirement of a byte (or the identical requirements > for a char), the implementation must manage some mapping. True. > And those > hardware bytes might well be too small *or* too large. I think I covered that... in reverse. 
> A machine with > four bit hardware bytes wanting to implement 8 bit C bytes, would need > to deal with the hardware bytes as pairs. I think I covered that... in reverse. > A word addressed machine > with 32 bit words (or hardware bytes), would need to generate code to > pack and unpack four C chars (again assuming we wanted the > implementation to have 8 bit C chars), from a single word as needed. Not necessarily. It could implement chars as 32 bit words or some other combination larger than 8-bits. > All of which is irrelevant, except to implementation. So, why'd you bring it up? > If you wanted to implement a system with 16 bit C bytes (and thus 16 > bit C chars), on a 8-bit-byte addressed machine, the compiler will > have to generate code so that all char accesses address a pair of 8- > bit hardware bytes. And the smallest addressable unit in the C > program will be that 16 bit C char. > > Nor is your assertion that hardware with a 9-bit hardware byte > requires a 9-bit C byte and char true. Nowhere did I say that... Reread. > While that might well make for > a convenient implementation on the machine, there is no reason that > the implementation might not expose 8 bit C bytes and chars, and > synthesize those out of the underlying 9-bit hardware bytes. True. Haven't we been over this? Either this time or last time? May of last year... > It > could, for example, store 9 C chars in 8 hardware bytes, > or store one > C char per hardware byte, and ignore one bit of each hardware byte. > > In fact, you might be tempted to do such a thing if you wanted to port > much existing C code to your 9-bit-byte machine, simply because so > much code will break if CHAR_BIT is not 8. .... > Nor is the range of signed and unsigned values that can be stored in a > C char irrelevant. A char is an integer type, and must meet certain > requirements. 
It has the additional requirement of needing to be able > to store all of the characters in the extended character set What "extended character set" ? The 3.6 section that you quoted only supports the "basic character set"... How do you rationalize inserting an "extended character set" into the discussion if not supported by 3.6? > for the > implementation, and that all members of the basic character set be > positive when stored in a char (which for 8-bit chars either requires > that all characters in the basic set have values less than 128, or > that char is unsigned). Irrelevant... see 3.6 Rod Pemberton
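The word-addressed case quoted in this post (8-bit C chars synthesized on a machine that can only address 32-bit words) can be sketched in C. This is only an illustration under stated assumptions: memory is modelled as an array of `uint32_t`, a char "address" is a plain index, and char 0 lives in the low-order bits of each word. The function names are hypothetical and a real compiler would emit equivalent shift-and-mask code inline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: four 8-bit C chars are packed into each 32-bit word. */
#define CHARS_PER_WORD 4u

/* Read the 8-bit char at char_index from word-addressed memory. */
unsigned load_char(const uint32_t *words, size_t char_index)
{
    size_t word = char_index / CHARS_PER_WORD;
    unsigned shift = (unsigned)(char_index % CHARS_PER_WORD) * 8u;
    return (unsigned)((words[word] >> shift) & 0xFFu);
}

/* Write the low 8 bits of value at char_index, leaving the other
   three chars in the same word untouched. */
void store_char(uint32_t *words, size_t char_index, unsigned value)
{
    size_t word = char_index / CHARS_PER_WORD;
    unsigned shift = (unsigned)(char_index % CHARS_PER_WORD) * 8u;
    uint32_t mask = (uint32_t)0xFFu << shift;
    words[word] = (words[word] & ~mask) | (((uint32_t)value & 0xFFu) << shift);
}

/* Self-check: neighbouring chars in one word do not disturb each other. */
void word_memory_self_check(void)
{
    uint32_t mem[2] = { 0, 0 };
    store_char(mem, 1, 0x41u);
    store_char(mem, 2, 0x42u);
    assert(load_char(mem, 1) == 0x41u);
    assert(load_char(mem, 2) == 0x42u);
    assert(load_char(mem, 0) == 0u);
}
```

From the C program's point of view the smallest addressable unit is then the 8-bit char, even though the hardware never addresses anything smaller than a word.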
From: Rod Pemberton on 6 Dec 2008 05:17

<NathanCBaker(a)gmail.com> wrote in message news:177efb5b-294a-4a07-aa3e-0027bb43e441(a)z1g2000yqn.googlegroups.com...
On Dec 5, 7:29 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
>
> If so, wouldn't you want "for" and "endfor" to
> have numbers appended so they explicitly match?

Think about that. This works fine:

for1
..for2
...for3
....for4
......
....endfor4
...endfor3
..endfor2
endfor1

Do you see any problem with explicitly labeling which for goes with which end? If not, what about now?

for1
..for2
...for3
....for4
......
...endfor3
....endfor4
..endfor2
endfor1

What about now?

for
..while
...if
....
...endfor
..endif
endwhile

How do the overlapping blocks, made possible by using explicit or semi-explicit terminators instead of generic ones, affect the code? Is it clear that you can't guarantee structured code with explicit or semi-explicit terminators? I.e., even if the braces are wrongly located in C, structured code is the result.

> I am still not convinced that C is an assembly language.

Who said it was?

> So, you are saying that C is a poorly designed language?

How do you come to that conclusion?

> > Oh, I'm sure they are known - just not by me - which was why I asked.
> > Hopefully, they are known by you since you wrote the program. But, you are
> > avoiding answering these basic questions. They should definitely be known
> > by Randall.
> >
>
> I answered your questions in an earlier post. What part of it do you
> want me to clarify?

You answered some questions. But, not those I was interested in.

Rod Pemberton
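The difference between the first two listings in this post can be made mechanical with a small stack-based check. This is a hedged sketch, not anything from the thread: the token encoding (+k opens block k, -k closes it, so "for3" is +3 and "endfor3" is -3) and the function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 32

/* Return 1 if every labelled terminator closes the most recently
   opened block, 0 if blocks overlap or are left unbalanced. */
int properly_nested(const int *tokens, size_t count)
{
    int open_blocks[MAX_DEPTH];
    size_t depth = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (tokens[i] > 0) {
            if (depth == MAX_DEPTH)
                return 0;               /* too deep for this sketch */
            open_blocks[depth++] = tokens[i];
        } else {
            if (depth == 0 || open_blocks[depth - 1] != -tokens[i])
                return 0;               /* closes the wrong block */
            depth--;
        }
    }
    return depth == 0;
}

/* The first listing nests cleanly; the second overlaps for3 and for4. */
void nesting_self_check(void)
{
    static const int nested[]     = { 1, 2, 3, 4, -4, -3, -2, -1 };
    static const int overlapped[] = { 1, 2, 3, 4, -3, -4, -2, -1 };
    assert(properly_nested(nested, 8));
    assert(!properly_nested(overlapped, 8));
}
```

With a single generic terminator such as C's closing brace there is no label to mismatch, so a parser always pairs the terminator with the innermost open block.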
From: robertwessel2 on 6 Dec 2008 02:52
On Dec 6, 3:36 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote: > <robertwess...(a)yahoo.com> wrote in message > > news:3361d292-73b2-4dd0-99a5-2bbc7d74dbd7(a)x38g2000yqj.googlegroups.com... > On Dec 5, 6:29 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote: > > > <robertwess...(a)yahoo.com> wrote in message > > >news:00b99e16-304e-4ecc-a6e2-d193025dd4de(a)q9g2000yqc.googlegroups.com... > > On Dec 4, 4:00 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote: > > I decided to reorganize the conversation slightly so that I could make my > points clearer. > > > Quoting from the C99 standard: > > > "3.6: byte: addressable unit of data storage large enough to hold any > > member of the basic character set of the execution environment. > > Previously: > > RP> > > > IIRC, the C> > > > standard requires a "C char" must be large enough to represent the > > > > > entire > > > > > character set. It must be at least the size of the minimum > > addressable > > Ok, I admit I switched around a "C char" and "C byte".... I got it correct > in one of our past discussions. Like I said, "from memory"... > > > "3.7.1: character - single-byte character <C> bit representation that > > fits in a byte" > > Previously: > > RW> > > A C char and byte are the same size, > > RP> > False. > > It seems you're still wrong here. 3.7.1's "fits in" is clearly different > than "same size". "fits in" clearly indicates one can be larger than the > other. > > Given > > > "3.6: byte: addressable unit of data storage large enough to hold any > > member of the basic character set of the execution environment. > > together with > > > "3.7.1: character - single-byte character <C> bit representation that > > fits in a byte" > > Previously: > > RW> > > There is no smaller unit of addressable storage in C than> > > a char, and chars must be at least 8 bits, > > RP> > True. > > Oh my! It seems we're *BOTH* wrong here. (Where's Phil when you need him?) 
> Those two C99 quotes clearly indicate that a "byte" is the smallest
> addressable unit of storage in C, not a char...
>
> > IOW, as far as C is concerned, the thing called a byte and a char are
> > essentially the same thing.
>
> False. You just quoted C99 above! It said a char must "fit in" a byte.
> I.e., a byte can be larger than a char. It said the byte is the smallest
> addressable unit from C's perspective. I admit I got them reversed, but you
> didn't grasp what you quoted!

No, see below.

> > That is reinforced in a number of other places in either version of
> > the standard. For example in 6.2.6.1 (C99):
>
> > "Values stored in non-bit-field objects of any other object type
> > consist of n * CHAR_BIT bits, where n is the size of an object of that
> > type, in bytes. The value may be copied into an object of type
> > unsigned char [n] (e.g., by memcpy); the resulting set of bytes is
> > called the object representation of the value."
>
> > Which clearly requires the equivalence of bytes and chars.
>
> As I see it, no. They're only partially equivalent in one direction. I.e.,
> bytes must be greater or equal to chars in size. If a byte is 9-bits, a
> char can be 8-bits. I.e., a char is not equivalent to a byte. But, a byte
> can represent a char and then some.
>
> > It clearly
> > says that N bytes can be stored in N (unsigned) chars.
>
> Reversed? I think that says N (whatever) chars fit in N bytes. Doesn't it?

No, it says you can store an object of N bytes in an array of N chars. So a byte must fit in a char. And you've acknowledged that a char must fit in a byte.

> > 5.2.4.2.1 of C99 ("Sizes of integer type" ) says in the definition of
> > CHAR_BIT, "number of bits for smallest object that is not a bit-field
> > (byte)". And further specified that CHAR_BIT be at least 8.
>
> > Footnote 40 of 6.2.6.1 (C99): "A byte contains CHAR_BIT bits." Which
> > happens to be exactly the same number of bit a char contains.
>
> > There are numerous other such statements.
>
> I haven't looked at those. But, I'd think most of these are likely
> "incorrect" from the abstraction of C from mostly 8-bit architectures that
> was done for C89. Or, it's "understood" to currently be clarified by 3.6
> and 3.7.1.

How is the statement, taken directly from the standard, that a byte contains CHAR_BIT bits in any way related to eight bit implementations, or in any way ambiguous as to the exact size of a C byte (IOW, it's CHAR_BIT)?

And to further quote footnote 40: "A byte contains CHAR_BIT bits, and the values of type unsigned char range from 0 to 2**CHAR_BIT - 1."

So a byte contains CHAR_BIT bits. And the numbers that you can put in an unsigned char exactly correspond to that. It's not "unsigned char ranges from zero to no more than (2**CHAR_BIT - 1)" - rather the range is exact. So a byte contains exactly the number of bits that can fit in an unsigned char. And an unsigned char can hold exactly the number of different values that can fit in a byte.

> > There is no smaller unit of addressability in C than a char. Which is
> > the same as a byte.
>
> False. You just quoted C99 above! It said a char must fit in a byte.
> I.e., a byte can be larger than a char. It said the byte is the smallest
> addressable unit from C's perspective. I admit I got them reversed, but you
> didn't grasp what you quoted!

A byte fits in a char, and a char fits in a byte. If you can find wiggle room for different sizes in there, you're cleverer than I am.

> > Your statement "if the smallest native addressable unit is 4-bits,
> > that's a C byte. And a C char must be at least 8-bits, therefore it's
> > at least two C bytes." is flatly wrong. There is not, without a
> > non-standard extension, any addressability to anything smaller than a
> > char. And C bytes may not be 4 bits.
> > There is no type "byte" in C,
> > it exists in the C mostly to distinguish the notion of the physically
> > stored data in memory from the logical type char.
>
> You're correct. This is all backwards. Think about it...
>
> > That hardware bytes (for lack of a better term for the smallest
> > addressable unit of storage) are commonly 8 bits these days is wholly
> > irrelevant. Hardware bytes, whatever those may be, are *not*
> > addressed by the C standard.
>
> They are partially addressed by the C standard. What do you think
> "addressable unit of data storage" really refers to? It refers to the fact
> that C's byte, the smallest addressable unit of storage, must map onto the
> hardware's addressable unit or units.

I have no clue what you're trying to say here. Obviously a C char or byte must eventually be stored in real memory, presumably in whatever physically addressable units the hardware actually provides (the "hardware byte" under discussion). The C standard continues to impose no required relationship between the hardware byte and the C byte/char. Most real implementations will, of course, attempt a mapping between the two that is simple and efficient (e.g. a C char/byte is implemented as a conventional 8-bit hardware byte), unless there is some really compelling reason to do otherwise.

> > (...)
>> A word addressed machine
>> with 32 bit words (or hardware bytes), would need to generate code to
>> pack and unpack four C chars (again assuming we wanted the
>> implementation to have 8 bit C chars), from a single word as needed.
>
>Not necessarily. It could implement chars as 32 bit words or some other
>combination larger than 8-bits.

What part of "assuming we wanted the implementation to have 8 bit C chars" did you miss in the above?

> > All of which is irrelevant, except to implementation.
>
> So, why'd you bring it up?

Because you did, by appearing to conflate hardware bytes and C bytes.
> > If you wanted to implement a system with 16 bit C bytes (and thus 16
> > bit C chars), on a 8-bit-byte addressed machine, the compiler will
> > have to generate code so that all char accesses address a pair of
> > 8-bit hardware bytes. And the smallest addressable unit in the C
> > program will be that 16 bit C char.
>
> > Nor is your assertion that hardware with a 9-bit hardware byte
> > requires a 9-bit C byte and char true.
>
> Nowhere did I say that... Reread.

"If the smallest native addressable unit is 9-bits, that's a C byte" appears to refer to hardware bytes, both in isolation and in context. If that's not what you meant, then my comment was superfluous.

> > While that might well make for
> > a convenient implementation on the machine, there is no reason that
> > the implementation might not expose 8 bit C bytes and chars, and
> > synthesize those out of the underlying 9-bit hardware bytes.
>
> True. Haven't we been over this? Either this time or last time? May of
> last year...

Yes. And you basically refused to acknowledge that the C standard is not described in terms of real hardware, and went away in a huff...

FWIW: RP: "This, of course, is due to his continued belief in the pure abstraction of C from the underlying hardware and assembly: a fallacy."

Any given implementation of course relates the two, but the C standard itself does not. Also obviously hardware with odd parameters may make C difficult to implement in various ways, and also clearly one reason that C is broadly popular is that most hardware does *not* produce significant difficulties for a C implementer. The reverse is true as well: it's hard to imagine a modern hardware designer not taking ease of C implementation into account when designing an architecture.

Nor does a C program need to be aware of any of those implementation details (although writing such strictly conforming and totally portable C code can be difficult, and is unusual in practice).

> What "extended character set" ?
> The 3.6 section that you quoted only
> supports the "basic character set"... How do you rationalize inserting an
> "extended character set" into the discussion if not supported by 3.6?

Character sets are defined in 5.2 (C99). The basic set includes the upper and lower case letters, digits, 29 punctuation marks, space and several control characters. The extended set also includes all other characters an implementation provides. For example, on ASCII implementations, the at-sign is a character you'd find in the extended set, but not in the basic. Extended characters that are not multi-byte characters (I omitted the "not multi-byte" condition in my first post), need to fit in a byte/char, but are not required to be positive values in a char.

The bottom line is this: The C standard uses the terms byte and char essentially synonymously, and further, the two must appear to be the same size from a C program's perspective. A byte is the unit of storage, the type is a char. A char must fit in a byte, and a byte must fit in a char (ignoring issues of signedness for the moment)...

And I want to mention that I quote from the C99 standard more often only because I have that in electronic form, and only hardcopies of C89, which makes for less typing...
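The byte/char equivalence argued above can be checked from inside a C program. The following is a minimal sketch with hypothetical function names; it relies only on `<limits.h>`, `sizeof` and `memcpy`, and the properties it asserts are the ones cited from 5.2.4.2.1 and 6.2.6.1.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Count the value bits of unsigned char by shifting UCHAR_MAX down to zero. */
int uchar_value_bits(void)
{
    unsigned max = UCHAR_MAX;
    int bits = 0;
    while (max != 0u) {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* Copy a long into unsigned char[sizeof(long)] and back again:
   an object of N bytes fits in an array of N unsigned chars. */
int survives_byte_copy(long value)
{
    unsigned char bytes[sizeof value];
    long copy = 0;
    memcpy(bytes, &value, sizeof value);
    memcpy(&copy, bytes, sizeof copy);
    return copy == value;
}

/* Checks that should hold on any conforming implementation. */
void byte_char_self_check(void)
{
    assert(sizeof(char) == 1);               /* sizeof counts in bytes */
    assert(sizeof(unsigned char) == 1);
    assert(CHAR_BIT >= 8);
    assert(uchar_value_bits() == CHAR_BIT);  /* range is exactly 0..2**CHAR_BIT - 1 */
    assert(survives_byte_copy(123456789L));
}
```

None of these checks depends on CHAR_BIT being 8; they should hold equally on an implementation with 16-bit or 32-bit chars.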