From: robertwessel2 on
On Dec 5, 6:00 pm, NathanCBa...(a)gmail.com wrote:
> On Dec 5, 6:39 pm, "Alexei A. Frounze" <alexfrun...(a)gmail.com> wrote:
>
>
>
> > Or one could implement C in such a way that chars and ints are of the
> > machine word size (>= 16 bits). That way the pointers to int and char
> > don't have to be of different size. Example: the compiler for TI's
> > TMS320C54xx series.
>
> Do you happen to know if the booleans were implemented as packed or
> unpacked?


The TI DSP C compilers implement C89 (and not C++ or C99), and so
don't define a boolean type. If you meant bit fields, they do pack
(as size allows), into the 16 bit ints.
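
To illustrate the bit-field point, here is a minimal sketch in standard C. The struct name `flags` and its field names are hypothetical and not taken from any TI header. The layout of bit fields inside a storage unit is implementation-defined, so the size comment describes what one would typically expect on a target with 16-bit int, not a guarantee.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical status word: three 1-bit flags plus a 5-bit counter.
   That is 8 bits in total, so it can share a single 16-bit int. */
struct flags {
    unsigned int ready : 1;
    unsigned int error : 1;
    unsigned int busy  : 1;
    unsigned int count : 5;
};

/* Size of the struct in bits. With 16-bit int and 16-bit char this would
   typically be 16; on a desktop compiler it is usually 32. */
size_t flags_size_bits(void)
{
    return sizeof(struct flags) * CHAR_BIT;
}

/* Write to several fields and confirm that none disturbs its neighbours. */
int flags_round_trip(void)
{
    struct flags f = {0, 0, 0, 0};

    f.ready = 1;
    f.busy  = 1;
    f.count = 21;               /* largest 5-bit value is 31 */
    assert(f.error == 0);       /* untouched field stays zero */
    return f.ready == 1 && f.busy == 1 && f.count == 21;
}
```

Comparing `flags_size_bits()` against `sizeof(int) * CHAR_BIT` on a given compiler shows whether the four fields were packed into one int.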
From: Alexei A. Frounze on
On Dec 6, 3:00 am, NathanCBa...(a)gmail.com wrote:
> On Dec 5, 6:39 pm, "Alexei A. Frounze" <alexfrun...(a)gmail.com> wrote:
>
>
>
> > Or one could implement C in such a way that chars and ints are of the
> > machine word size (>= 16 bits). That way the pointers to int and char
> > don't have to be of different size. Example: the compiler for TI's
> > TMS320C54xx series.
>
> Do you happen to know if the booleans were implemented as packed or
> unpacked?

I have no idea, never used them in C, since they appeared somewhat
late in the game and were easy to implement in a variety of ways. I'm
not even sure if the compiler supported them.

Alex
From: Rod Pemberton on
<robertwessel2(a)yahoo.com> wrote in message
news:3361d292-73b2-4dd0-99a5-2bbc7d74dbd7(a)x38g2000yqj.googlegroups.com...
On Dec 5, 6:29 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
> <robertwess...(a)yahoo.com> wrote in message
>
> news:00b99e16-304e-4ecc-a6e2-d193025dd4de(a)q9g2000yqc.googlegroups.com...
> On Dec 4, 4:00 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
>

I decided to reorganize the conversation slightly so that I could make my
points clearer.

> Quoting from the C99 standard:
>
> "3.6: byte: addressable unit of data storage large enough to hold any
> member of the basic character set of the execution environment.
>

Previously:

RP> > > > IIRC, the C standard requires a "C char" must be large enough
> > > > to represent the entire character set. It must be at least the
> > > > size of the minimum addressable

Ok, I admit I switched around a "C char" and "C byte".... I got it correct
in one of our past discussions. Like I said, "from memory"...

> "3.7.1: character - single-byte character <C> bit representation that
> fits in a byte"

Previously:

RW> > > A C char and byte are the same size,
> > >
RP> > False.

It seems you're still wrong here. 3.7.1's "fits in" is clearly different
than "same size". "fits in" clearly indicates one can be larger than the
other.

Given

> "3.6: byte: addressable unit of data storage large enough to hold any
> member of the basic character set of the execution environment.

together with

> "3.7.1: character - single-byte character <C> bit representation that
> fits in a byte"

Previously:

RW> > > There is no smaller unit of addressable storage in C than
> > > a char, and chars must be at least 8 bits,
> > >
RP> > True.

Oh my! It seems we're *BOTH* wrong here. (Where's Phil when you need him?)
Those two C99 quotes clearly indicate that a "byte" is the smallest
addressable unit of storage in C, not a char...

> IOW, as far as C is concerned, the thing called a byte and a char are
> essentially the same thing.

False. You just quoted C99 above! It said a char must "fit in" a byte.
I.e., a byte can be larger than a char. It said the byte is the smallest
addressable unit from C's perspective. I admit I got them reversed, but you
didn't grasp what you quoted!

> That is reinforced in a number of other places in either version of
> the standard. For example in 6.2.6.1 (C99):
>
> "Values stored in non-bit-field objects of any other object type
> consist of n * CHAR_BIT bits, where n is the size of an object of that
> type, in bytes. The value may be copied into an object of type
> unsigned char [n] (e.g., by memcpy); the resulting set of bytes is
> called the object representation of the value."
>
> Which clearly requires the equivalence of bytes and chars.

As I see it, no. They're only partially equivalent in one direction. I.e.,
bytes must be greater or equal to chars in size. If a byte is 9-bits, a
char can be 8-bits. I.e., a char is not equivalent to a byte. But, a byte
can represent a char and then some.

> It clearly
> says that N bytes can be stored in N (unsigned) chars.

Reversed? I think that says N (whatever) chars fit in N bytes. Doesn't it?

I.e., if bytes are 9-bits, if chars are 8-bits, if CHAR_BIT is 8, if the
object is 5 bytes in size, then the object is n*CHAR_BIT bits or 5*8-bits.
And, the object "fits in" n*bytes or 5*9-bits. I.e., the upper bit of the
byte is "dead space" or there is 1-bit of padding between chars.

> 5.2.4.2.1 of C99 ("Sizes of integer type" ) says in the definition of
> CHAR_BIT, "number of bits for smallest object that is not a bit-field
> (byte)". And further specified that CHAR_BIT be at least 8.
>
> Footnote 40 of 6.2.6.1 (C99): "A byte contains CHAR_BIT bits." Which
> happens to be exactly the same number of bits a char contains.
>
> There are numerous other such statements.

I haven't looked at those. But, I'd think most of these are likely
"incorrect" from the abstraction of C from mostly 8-bit architectures that
was done for C89. Or, it's "understood" to currently be clarified by 3.6
and 3.7.1.

> There is no smaller unit of addressability in C than a char. Which is
> the same as a byte.

False. You just quoted C99 above! It said a char must fit in a byte.
I.e., a byte can be larger than a char. It said the byte is the smallest
addressable unit from C's perspective. I admit I got them reversed, but you
didn't grasp what you quoted!

> Your statement "if the smallest native addressable unit is 4-bits,
> that's a C byte. And a C char must be at least 8-bits, therefore it's
> at least two C bytes." is flatly wrong. There is not, without a non-
> standard extension, any addressability to anything smaller than a
> char. And C bytes may not be 4 bits. There is no type "byte" in C,
> it exists in the C mostly to distinguish the notion of the physically
> stored data in memory from the logical type char.

You're correct. This is all backwards. Think about it...

> That hardware bytes (for lack of a better term for the smallest
> addressable unit of storage) are commonly 8 bits these days is wholly
> irrelevant. Hardware bytes, whatever those may be, are *not*
> addressed by the C standard.

They are partially addressed by the C standard. What do you think
"addressable unit of data storage" really refers to? It refers to the fact
that C's byte, the smallest addressable unit of storage, must map onto the
hardware's addressable unit or units.

> If whatever the hardware likes to treat
> as the minimal addressable unit does not meet the C (and
> implementation) requirement of a byte (or the identical requirements
> for a char), the implementation must manage some mapping.

True.

> And those
> hardware bytes might well be too small *or* too large.

I think I covered that... in reverse.

> A machine with
> four bit hardware bytes wanting to implement 8 bit C bytes, would need
> to deal with the hardware bytes as pairs.

I think I covered that... in reverse.

> A word addressed machine
> with 32 bit words (or hardware bytes), would need to generate code to
> pack and unpack four C chars (again assuming we wanted the
> implementation to have 8 bit C chars), from a single word as needed.

Not necessarily. It could implement chars as 32 bit words or some other
combination larger than 8-bits.

> All of which is irrelevant, except to implementation.

So, why'd you bring it up?

> If you wanted to implement a system with 16 bit C bytes (and thus 16
> bit C chars), on a 8-bit-byte addressed machine, the compiler will
> have to generate code so that all char accesses address a pair of 8-
> bit hardware bytes. And the smallest addressable unit in the C
> program will be that 16 bit C char.
>
> Nor is your assertion that hardware with a 9-bit hardware byte
> requires a 9-bit C byte and char true.

Nowhere did I say that... Reread.

> While that might well make for
> a convenient implementation on the machine, there is no reason that
> the implementation might not expose 8 bit C bytes and chars, and
> synthesize those out of the underlying 9-bit hardware bytes.

True. Haven't we been over this? Either this time or last time? May of
last year...

> It
> could, for example, store 9 C chars in 8 hardware bytes,
> or store one
> C char per hardware byte, and ignore one bit of each hardware byte.
>
> In fact, you might be tempted to do such a thing if you wanted to port
> much existing C code to your 9-bit-byte machine, simply because so
> much code will break if CHAR_BIT is not 8.
....

> Nor is the range of signed and unsigned values that can be stored in a
> C char irrelevant. A char is an integer type, and must meet certain
> requirements. It has the additional requirement of needing to be able
> to store all of the characters in the extended character set

What "extended character set" ? The 3.6 section that you quoted only
supports the "basic character set"... How do you rationalize inserting an
"extended character set" into the discussion if not supported by 3.6?

> for the
> implementation, and that all members of the basic character set be
> positive when stored in a char (which for 8-bit chars either requires
> that all characters in the basic set have values less than 128, or
> that char is unsigned).

Irrelevant... see 3.6


Rod Pemberton


From: Rod Pemberton on
<NathanCBaker(a)gmail.com> wrote in message
news:177efb5b-294a-4a07-aa3e-0027bb43e441(a)z1g2000yqn.googlegroups.com...
On Dec 5, 7:29 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
>
> If so, wouldn't you want "for" and "endfor" to
> have numbers appended so they explicitly match?

Think about that. This works fine:

for1
..for2
...for3
....for4
......
....endfor4
...endfor3
..endfor2
endfor1

Do you see any problem with explicitly labeling which for goes with which
end? If not, what about now?

for1
..for2
...for3
....for4
......
...endfor3
....endfor4
..endfor2
endfor1

What about now?

for
..while
...if
....
...endfor
..endif
endwhile

How do the overlapping blocks, produced by using explicit or semi-explicit
terminators instead of generic ones, affect the code? Is it clear that you
can't guarantee structured code with explicit or semi-explicit terminators?
I.e., even if the braces are wrongly located in C, structured code is the
result.

> I am still not convinced that C is an assembly language.

Who said it was?

> So, you are saying that C is a poorly designed language?

How do you come to that conclusion?

> > Oh, I'm sure they are known - just not by me - which was why I asked.
> > Hopefully, they are known by you since you wrote the program. But, you are
> > avoiding answering these basic questions. They should definitely be known
> > by Randall.
> >
>
> I answered your questions in an earlier post. What part of it do you
> want me to clarify?

You answered some questions. But, not those I was interested in.


Rod Pemberton


From: robertwessel2 on
On Dec 6, 3:36 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
> <robertwess...(a)yahoo.com> wrote in message
>
> news:3361d292-73b2-4dd0-99a5-2bbc7d74dbd7(a)x38g2000yqj.googlegroups.com...
> On Dec 5, 6:29 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
>
> > <robertwess...(a)yahoo.com> wrote in message
>
> >news:00b99e16-304e-4ecc-a6e2-d193025dd4de(a)q9g2000yqc.googlegroups.com...
> > On Dec 4, 4:00 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
>
> I decided to reorganize the conversation slightly so that I could make my
> points clearer.
>
> > Quoting from the C99 standard:
>
> > "3.6: byte: addressable unit of data storage large enough to hold any
> > member of the basic character set of the execution environment.
>
> Previously:
>
> RP> > > > IIRC, the C standard requires a "C char" must be large enough
> > > > > to represent the entire character set.  It must be at least the
> > > > > size of the minimum addressable
>
> Ok, I admit I switched around a "C char" and "C byte"....  I got it correct
> in one of our past discussions.  Like I said, "from memory"...
>
> > "3.7.1: character - single-byte character <C> bit representation that
> > fits in a byte"
>
> Previously:
>
> RW> > > A C char and byte are the same size,
>
> RP> > False.
>
> It seems you're still wrong here.  3.7.1's "fits in" is clearly different
> than "same size".  "fits in" clearly indicates one can be larger than the
> other.
>
> Given
>
> > "3.6: byte: addressable unit of data storage large enough to hold any
> > member of the basic character set of the execution environment.
>
> together with
>
> > "3.7.1: character - single-byte character <C> bit representation that
> > fits in a byte"
>
> Previously:
>
> RW> > > There is no smaller unit of addressable storage in C than
> > > > a char, and chars must be at least 8 bits,
>
> RP> > True.
>
> Oh my!  It seems we're *BOTH* wrong here.  (Where's Phil when you need him?)
> Those two C99 quotes clearly indicate that a "byte" is the smallest
> addressable unit of storage in C, not a char...
>
> > IOW, as far as C is concerned, the thing called a byte and a char are
> > essentially the same thing.
>
> False.  You just quoted C99 above!  It said a char must "fit in" a byte.
> I.e., a byte can be larger than a char.  It said the byte is the smallest
> addressable unit from C's perspective.  I admit I got them reversed, but you
> didn't grasp what you quoted!


No, see below.


> > That is reinforced in a number of other places in either version of
> > the standard.  For example in 6.2.6.1 (C99):
>
> > "Values stored in non-bit-field objects of any other object type
> > consist of n * CHAR_BIT bits, where n is the size of an object of that
> > type, in bytes. The value may be copied into an object of type
> > unsigned char [n] (e.g., by memcpy); the resulting set of bytes is
> > called the object representation of the value."
>
> > Which clearly requires the equivalence of bytes and chars.
>
> As I see it, no.  They're only partially equivalent in one direction.  I.e.,
> bytes must be greater or equal to chars in size.  If a byte is 9-bits, a
> char can be 8-bits.  I.e., a char is not equivalent to a byte.  But, a byte
> can represent a char and then some.
>
> > It clearly
> > says that N bytes can be stored in N (unsigned) chars.
>
> Reversed?  I think that says N (whatever) chars fit in N bytes.  Doesn't it?


No, it says you can store an object of N bytes in an array of N
chars. So a byte must fit in a char. And you've acknowledged that a
char must fit in a byte.
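
The memcpy wording in 6.2.6.1 can be tried out directly. The following is a minimal sketch, with hypothetical function names, that copies an object of `sizeof(long)` bytes into an array of exactly that many unsigned chars and back again. If a byte could be larger than a char, the array would be too small to hold the object representation.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Copy a long into unsigned char[sizeof(long)] and back.
   Returns 1 if the value survives the round trip. */
int round_trip_through_bytes(long value)
{
    unsigned char rep[sizeof(long)];   /* n bytes held in n unsigned chars */
    long copy;

    assert(sizeof rep == sizeof value);
    memcpy(rep, &value, sizeof value); /* rep is the object representation */
    memcpy(&copy, rep, sizeof copy);
    return copy == value;
}

/* Number of bits in the object representation of a long: n * CHAR_BIT. */
size_t long_bits(void)
{
    return sizeof(long) * CHAR_BIT;
}
```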


> > 5.2.4.2.1 of C99 ("Sizes of integer type" ) says in the definition of
> > CHAR_BIT, "number of bits for smallest object that is not a bit-field
> > (byte)".  And further specified that CHAR_BIT be at least 8.
>
> > Footnote 40 of 6.2.6.1 (C99): "A byte contains CHAR_BIT bits."  Which
> > happens to be exactly the same number of bits a char contains.
>
> > There are numerous other such statements.
>
> I haven't looked at those.  But, I'd think most of these are likely
> "incorrect" from the abstraction of C from mostly 8-bit architectures that
> was done for C89.  Or, it's "understood" to currently be clarified by 3.6
> and 3.7.1.


How is the statement, taken directly from the standard, that a byte
contains CHAR_BIT bits in any way related to eight bit
implementations, or in any way ambiguous as to the exact size of a C
byte (IOW, it's CHAR_BIT bits)?

And to further quote footnote 40:

"A byte contains CHAR_BIT bits, and the values of type unsigned char
range from 0 to 2**CHAR_BIT - 1." So a byte contains CHAR_BIT bits.
And the numbers that you can put in an unsigned char exactly
correspond to that. It's not "unsigned char ranges from zero to no
more than (2**CHAR_BIT - 1)" - rather the range is exact. So a byte
contains exactly the number of bits that can fit in an unsigned char.
And an unsigned char can hold exactly the number of different values
that can fit in a byte.
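
These relationships can be checked on any conforming compiler. Here is a minimal sketch using only `<limits.h>`; the function names are hypothetical. The two-step shift is there so that the expression stays well defined even on an implementation where unsigned long is exactly CHAR_BIT bits wide.

```c
#include <assert.h>
#include <limits.h>

/* Compute 2**CHAR_BIT - 1 without ever shifting by the full width
   of unsigned long. */
unsigned long max_from_char_bit(void)
{
    return (((1UL << (CHAR_BIT - 1)) - 1UL) << 1) | 1UL;
}

/* Returns 1 if the byte and char properties quoted above hold here. */
int byte_and_char_agree(void)
{
    assert(CHAR_BIT >= 8);             /* 5.2.4.2.1 minimum */
    return sizeof(char) == 1           /* a char occupies one byte */
        && sizeof(unsigned char) == 1
        && max_from_char_bit() == (unsigned long)UCHAR_MAX;
}
```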


> > There is no smaller unit of addressability in C than a char.  Which is
> > the same as a byte.
>
> False.  You just quoted C99 above!  It said a char must fit in a byte.
> I.e., a byte can be larger than a char.  It said the byte is the smallest
> addressable unit from C's perspective.  I admit I got them reversed, but you
> didn't grasp what you quoted!


A byte fits in a char, and a char fits in a byte. If you can find
wiggle room for different sizes in there, you're cleverer than I am.


> > Your statement "if the smallest native addressable unit is 4-bits,
> > that's a C byte.  And a C char must be at least 8-bits, therefore it's
> > at least two C bytes."  is flatly wrong.  There is not, without a non-
> > standard extension, any addressability to anything smaller than a
> > char.  And C bytes may not be 4 bits.  There is no type "byte" in C,
> > it exists in the C mostly to distinguish the notion of the physically
> > stored data in memory from the logical type char.
>
> You're correct.  This is all backwards.  Think about it...
>
> > That hardware bytes (for lack of a better term for the smallest
> > addressable unit of storage) are commonly 8 bits these days is wholly
> > irrelevant.  Hardware bytes, whatever those may be, are *not*
> > addressed by the C standard.
>
> They are partially addressed by the C standard.  What do you think
> "addressable unit of data storage" really refers to?  It refers to the fact
> that C's byte, the smallest addressable unit of storage, must map onto the
> hardware's addressable unit or units.


I have no clue what you're trying to say here. Obviously a C char or
byte must eventually be stored in real memory, presumably in whatever
physically addressable units that the hardware actually provides (the
"hardware byte" under discussion). The C standard continues to impose
no required relationship between the hardware byte and the C byte/
char. Most real implementations will, of course, attempt a mapping
between the two that is simple and efficient (e.g. a C char/byte is
implemented as a conventional 8-bit hardware byte), unless there is
some really compelling reason to do otherwise.


> > (...)
> > A word addressed machine
> > with 32 bit words (or hardware bytes), would need to generate code to
> > pack and unpack four C chars (again assuming we wanted the
> > implementation to have 8 bit C chars), from a single word as needed.
>
> Not necessarily. It could implement chars as 32 bit words or some other
> combination larger than 8-bits.


What part of "assuming we wanted the implementation to have 8 bit C
chars" did you miss in the above?
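
For the 8-bit-char case, the pack and unpack code a compiler would have to emit looks roughly like the sketch below. The names `word_t`, `get_char` and `put_char` are hypothetical, and placing char 0 in the low-order bits is an assumption; a real implementation could order the chars within a word either way. The sketch uses C99's `<stdint.h>` for a fixed 32-bit type.

```c
#include <assert.h>
#include <stdint.h>

/* One addressable 32-bit word holding four 8-bit C chars. */
typedef uint32_t word_t;

/* Unpack: read the 8-bit char at position idx (0..3) of a word. */
unsigned get_char(word_t w, unsigned idx)
{
    assert(idx < 4u);
    return (unsigned)((w >> (idx * 8u)) & 0xFFu);
}

/* Pack: return w with the char at position idx replaced by c. */
word_t put_char(word_t w, unsigned idx, unsigned c)
{
    word_t mask;

    assert(idx < 4u);
    mask = (word_t)0xFFu << (idx * 8u);
    return (w & ~mask) | (((word_t)c & 0xFFu) << (idx * 8u));
}
```

A char pointer on such a machine would then need to carry both the word address and the position within the word, which is one reason pointers to char and pointers to int can differ in size.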


> > All of which is irrelevant, except to implementation.
>
> So, why'd you bring it up?


Because you did, by appearing to conflate hardware bytes and C bytes.


> > If you wanted to implement a system with 16 bit C bytes (and thus 16
> > bit C chars), on a 8-bit-byte addressed machine, the compiler will
> > have to generate code so that all char accesses address a pair of 8-
> > bit hardware bytes.  And the smallest addressable unit in the C
> > program will be that 16 bit C char.
>
> > Nor is your assertion that hardware with a 9-bit hardware byte
> > requires a 9-bit C byte and char true.
>
> Nowhere did I say that...  Reread.


"If the smallest native addressable unit is 9-bits, that's a C byte"
appears to refer to hardware bytes, both in isolation and in context.
If that's not what you meant, then my comment was superfluous.


> > While that might well make for
> > a convenient implementation on the machine, there is no reason that
> > the implementation might not expose 8 bit C bytes and chars, and
> > synthesize those out of the underlying 9-bit hardware bytes.
>
> True.  Haven't we been over this?  Either this time or last time?  May of
> last year...


Yes. And you basically refused to acknowledge that the C standard is
not described in terms of real hardware, and went away in a huff...

FWIW: RP: "This, of course, is due to his continued belief in the pure
abstraction of C from the underlying hardware and assembly: a
fallacy."

Any given implementation of course relates the two, but the C standard
itself does not. Also obviously hardware with odd parameters may make
C difficult to implement in various ways, and also clearly one reason
that C is broadly popular is that most hardware does *not* produce
significant difficulties for a C implementer. The reverse is true as
well: it's hard to imagine a modern hardware designer not taking ease of
C implementation into account when designing an architecture. Nor
does a C program need to be aware of any of those implementation
details (although writing such strictly conforming and totally
portable C code can be difficult, and is unusual in practice).


> What "extended character set" ?  The 3.6 section that you quoted only
> supports the "basic character set"...  How do you rationalize inserting an
> "extended character set" into the discussion if not supported by 3.6?


Character sets are defined in 5.2 (C99). The basic set includes the
upper and lower case letters, digits, 29 punctuation marks, space and
several control characters. The extended set also includes all other
characters an implementation provides. For example, on ASCII
implementations, the at-sign is a character you'd find in the extended
set, but not in the basic. Extended characters that are not multi-
byte characters (I omitted the "not multi-byte" condition in my first
post), need to fit in a byte/char, but are not required to be positive
values in a char.
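
The requirement that basic characters be non-negative in a plain char can be spot-checked as follows. This is a minimal sketch with hypothetical names; `basic_sample` lists the letters, digits, the 29 punctuation marks, space and a few control characters from 5.2.1. The at-sign is deliberately left out because, as an extended character, it carries no such guarantee.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if every character of s is non-negative as a plain char. */
int all_nonnegative(const char *s)
{
    assert(s != 0);
    for (; *s != '\0'; ++s)
        if (*s < 0)
            return 0;
    return 1;
}

/* A sample of basic character set members. */
static const char basic_sample[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
    "!\"#%&'()*+,-./:;<=>?[\\]^_{|}~ \t\v\f";

int basic_sample_ok(void)
{
    return all_nonnegative(basic_sample);
}

/* Whether plain char is a signed type on this implementation. */
int char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```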



The bottom line is this: The C standard uses the terms byte and char
essentially synonymously, and further, the two must appear to be the
same size from a C program's perspective. A byte is the unit of storage, the
type is a char. A char must fit in a byte, and a byte must fit in a
char (ignoring issues of signedness for the moment)...

And I want to mention that I quote from the C99 standard more often
only because I have that in electronic form, and only hardcopies of
C89, which makes for less typing...