From: robertwessel2 on
On Dec 7, 12:20 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
> <robertwess...(a)yahoo.com> wrote in message
>
> news:8d725647-529b-40ae-a97e-ad6d296e10c4(a)33g2000yqm.googlegroups.com...
> On Dec 6, 3:36 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
>
> > <robertwess...(a)yahoo.com> wrote in message
>
> > > > "Values stored in non-bit-field objects of any other object type
> > > > consist of n * CHAR_BIT bits, where n is the size of an object of that
> > > > type, in bytes. The value may be copied into an object of type
> > > > unsigned char [n] (e.g., by memcpy); the resulting set of bytes is
> > > > called the object representation of the value."
>
> > > > Which clearly requires the equivalence of bytes and chars.
>
> > > As I see it, no. They're only partially equivalent in one direction.
> I.e.,
> > > bytes must be greater or equal to chars in size. If a byte is 9-bits, a
> > > char can be 8-bits. I.e., a char is not equivalent to a byte. But, a
> byte
> > > can represent a char and then some.
>
> > > > It clearly
> > > > says that N bytes can be stored in N (unsigned) chars.
>
> > > Reversed? I think that says N (whatever) chars fit in N bytes. Doesn't
> it?
>
> > No, it says you can store an object of N bytes in an array of N
> > chars.
>
> Where?  Unless you misquoted, it says:
>   1) the object consists of n*CHAR_BITS
>   2) n is the number of bytes needed to build that object
>   3) the resulting set of bytes is the objects representation
>   4) an object comprised of some char's can be copied into a bunch of bytes
>
> Their truth states:
>   1) is true regardless of the size of a byte
>   2) is true as long as a byte is larger than or equal to a char in bits
>   3) is true as long as a byte is larger than or equal to a char in bits
>   4) is true regardless of the size of a byte and is true as long as a byte
> is larger than or equal to a char in bits


Ugh.  Your first #4 is plainly wrong.  The standard says the N bytes
of an object can be copied into an array of N unsigned chars.  Those
characters are also stored in bytes, of course.

Given the utterly plain language that you are misinterpreting, I'm
beginning to wonder if you're serious.

Here's the statement broken apart a bit:

"Values stored in non-bit-field objects of any other object type
consist of n * CHAR_BIT bits, where n is the size of an object of that
type, in bytes."

- an object is stored in N bytes, each with CHAR_BIT bits

"The value may be copied into an object of type unsigned char [n]
(e.g., by memcpy);"

- Those N bytes, or at least the values contained therein, can be
copied into an array of N chars.

"the resulting set of bytes is called the object representation of the
value."

- that array of chars is also a bunch of bytes

And note that memcpy() is defined to move *chars*. So memcpy, which
moves chars, also happens to move the same number of bytes (by the
above).
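To make that concrete, here is a minimal sketch (assuming nothing beyond
a hosted C99 implementation): it copies an object's representation into
an unsigned char array of the same size, and the loop visits exactly
sizeof x bytes, each CHAR_BIT bits wide.

#include <limits.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned int x = 0xCAFEu;
    unsigned char bytes[sizeof x];        /* unsigned char [n], n == sizeof(unsigned int) */

    memcpy(bytes, &x, sizeof x);          /* memcpy moves sizeof(x) chars == sizeof(x) bytes */

    for (size_t i = 0; i < sizeof x; i++) /* one pass per byte of the object representation */
        printf("byte %zu of %zu (CHAR_BIT = %d): 0x%02X\n",
               i, sizeof x, CHAR_BIT, (unsigned)bytes[i]);
    return 0;
}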


> > So a byte must fit in a char.  And you've acknowledged that a
> > char must fit in a byte.
>
> Illogical conclusion.  Your basis is based upon your misunderstanding of
> what was stated.
>
>
>
> > > > 5.2.4.2.1 of C99 ("Sizes of integer type" ) says in the definition of
> > > > CHAR_BIT, "number of bits for smallest object that is not a bit-field
> > > > (byte)". And further specified that CHAR_BIT be at least 8.
>
> > > > Footnote 40 of 6.2.6.1 (C99): "A byte contains CHAR_BIT bits." Which
> > > > happens to be exactly the same number of bits a char contains.
>
> > > > There are numerous other such statements.
>
> > > I haven't looked at those. But, I'd think most of these are likely
> > > "incorrect" from the abstraction of C from mostly 8-bit architectures
> that
> > > was done for C89. Or, it's "understood" to currently be clarified by 3.6
> > > and 3.7.1.
>
> > How is the statement, taken directly from the standard, that a byte
> > contains CHAR_BIT bits in any way related to eight bit
> > implementations, or in any way ambiguous as to the exact size of a C
> > byte (IOW, it's CHAR_BITS)?
>
> Because, they clearly state exactly what I stated at the beginning of this
> discussion, which you demonstrated was incorrect, specifically reversed in
> terms of my statement of byte and char.  Therefore, it's only logical to
> assume these are incorrect too and have their incorrectness based upon
> historical abstractions from working versions of C.
>
> > > There is no smaller unit of addressability in C than a char. Which is
> > > the same as a byte.
>
> > False. You just quoted C99 above! It said a char must fit in a byte.
> > I.e., a byte can be larger than a char. It said the byte is the smallest
> > addressable unit from C's perspective. I admit I got them reversed, but
> you
> > didn't grasp what you quoted!
>
> ...


It's completely unclear how you think 3.6 and 3.7.1 contradict each
other.

And while "A fits in B" can mean "B is larger than A" ("the baseball
fits in the shoebox"), it's also perfectly valid to use that form to
mean "A *exactly* fits B".  For example, "bolt A fits in nut B."
Despite the fact that a 1/8 inch bolt will actually "fit" into a 1/4
inch nut, the plain meaning is that the bolt *exactly* matches the
nut.  While what the two sections were talking about may not make it
completely clear which of the two senses of “fit” they mean, numerous
other places in the standard do.

Nice selective quoting, BTW. Why not at least address the even
plainer extended quote from footnote 40:

"A byte contains CHAR_BIT bits, and the values of type unsigned char
range from 0 to 2**CHAR_BIT - 1." So a byte contains CHAR_BIT bits.
And the numbers that you can put in an unsigned char exactly
correspond to that. It's not "unsigned char ranges from zero to no
more than (2**CHAR_BIT - 1)" - rather the range is exact. So a byte
contains exactly the number of bits that can fit in an unsigned char.
And an unsigned char can hold exactly the number of different values
that can fit in a byte.
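A small sketch of that correspondence (just an illustration; it relies
only on <limits.h>, and the arithmetic is arranged so the shift never
exceeds the width of unsigned long):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* 2**CHAR_BIT - 1, computed without shifting by the full width */
    unsigned long max_from_bits = ((1UL << (CHAR_BIT - 1)) - 1UL) * 2UL + 1UL;

    printf("CHAR_BIT        = %d\n", CHAR_BIT);
    printf("UCHAR_MAX       = %lu\n", (unsigned long)UCHAR_MAX);
    printf("2**CHAR_BIT - 1 = %lu (%s)\n", max_from_bits,
           max_from_bits == (unsigned long)UCHAR_MAX ? "same" : "different");
    return 0;
}

On a conforming implementation the last two lines print the same number,
since unsigned char has no padding bits.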


> > A byte fits in a char,
>
> False.
>
> > and a char fits in a byte.
>
> True.
>
> > If you can find
> > wiggle room for different sizes in there, you're cleverer than I am.
>
> There's no wiggle room.  You proved one is False and the other is True by
> quoting 3.6 and 3.7.1.  If the other sections apply as you state, then one
> or both of the definitions for 3.6 and/or 3.7.1 must be False.


I'm utterly baffled.


> > > > Your statement "if the smallest native addressable unit is 4-bits,
> > > > that's a C byte. And a C char must be at least 8-bits, therefore it's
> > > > at least two C bytes." is flatly wrong. There is not, without a non-
> > > > standard extension, any addressability to anything smaller than a
> > > > char. And C bytes may not be 4 bits. There is no type "byte" in C,
> > > > it exists in the C mostly to distinguish the notion of the physically
> > > > stored data in memory from the logical type char.
>
> > > You're correct. This is all backwards. Think about it...
>
> > > > That hardware bytes (for lack of a better term for the smallest
> > > > addressable unit of storage) are commonly 8 bits these days is wholly
> > > > irrelevant. Hardware bytes, whatever those may be, are *not*
> > > > addressed by the C standard.
>
> > > They are partially addressed by the C standard. What do you think
> > > "addressable unit of data storage" really refers to? It refers to the
> fact
> > > that C's byte, the smallest addressable unit of storage, must map onto
> the
> > > hardware's addressable unit or units.
>
> > I have no clue what you're trying to say here.
>
> Yup.  Not to be offensive, but that's part of the problem.


If the reader is baffled, it may be incompetence on the part of either
the reader or the writer.  Or both, of course.


> > Obviously a C char or
> > byte must eventually by stored in real memory, presumably in whatever
> > physically addressable units that the hardware actually provides (the
> > "hardware byte" under discussion).  The C standard continues to impose
> > no required relationship between the hardware byte and the C byte/
> > char.
>
> Explicitly, no.  But, you can see remnants of it in the spec, if you look.  A
> char being a minimum of 8-bits in limits.h is one such case.  3.6 and 3.7.1
> don't say it must be 8-bits or larger.  Technically, at the time C89 was
> defined only ASCII and EBCDIC were in use.  I.e., a char could've been
> defined with a minimum of 7-bits.  So, why do you think it's 8-bits?  I
> think it's 8-bits because C's with 8-bit chars and 8-bit bytes were used to
> create C89.


Clearly much of C89 was an attempt to codify existing practice.

True enough, 3.6 and 3.7 don't require a minimum of 8 bits but that
happens elsewhere. So what?

And machines with six, nine and ten bit characters were in
(reasonably) common use at the time C89 was being written. Some even
had C implementations.

They set some minimums, because that's useful. Obviously other
minimums are implied in various ways (for example, you couldn't have
six bit chars, because there are too many required characters in the
basic set). Why did they settle on eight bits when seven would have
done? No existing practice, for one, and little point for another
(what would be the odds that someone would actually build such a
machine?). It also helps the programmer by setting a useful minimum -
an eight bit byte will, in fact, accommodate the vast majority of the
world's stored data (at least at that time – now there’s a bunch stored
in Unicode too).

A related example: why did they set the minimum size of a long to be
32 bits? The changes in the standard required to make it 16 bits
would be trivial (and there are plenty of machines on which 32 bits is
*not* a natural type), but I'm happy they chose the larger value,
since it makes my life easier.
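As an aside, those guaranteed minimums show up directly in <limits.h>;
a compile-time check like this sketch cannot fire on a conforming
implementation (the run-time part does nothing):

#include <limits.h>

/* C89/C99 require LONG_MAX >= 2147483647 and INT_MAX >= 32767,
   so a conforming compiler can never take these #error branches. */
#if LONG_MAX < 2147483647L
#error "long is narrower than the guaranteed 32-bit minimum"
#endif
#if INT_MAX < 32767
#error "int is narrower than the guaranteed 16-bit minimum"
#endif

int main(void) { return 0; }   /* the checks happen entirely at translation time */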


> > Most real implementations will, of course, attempt a mapping
> > between the two that is simple and efficient (eg. a C char/byte is
> > implemented as a conventional 8-bit hardware byte), unless there is
> > some really compelling reason to do otherwise.
>
> A char's value is accessible in C, but a char is not addressable according
> to the spec. you quoted.  A byte is addressable according to the spec. you
> quoted, but a byte's value may not be entirely accessible in C.  Only the
> part of byte which overlaps with a char is accessible.  The byte represents
> the hardware addressability issue which has to be solved by a real
> implementation to implement objects in C comprised of C char's as contiguous
> sequences of bytes on hardware.  Does that make more sense?


That's true of a hardware byte. Not a C byte. The implementation
creates some mapping between C bytes and hardware bytes, not
necessarily 1:1 (an implementation on a 9 bit machine could map nine 8
bit C bytes into eight 9-bit hardware bytes), or even using the
entirety of the hardware bytes (an implementation on a 9 bit machine
might ignore one bit of each hardware byte, thus making the mapping of
eight bit C bytes onto the hardware bytes appear to be 1:1, while
leaving that ninth bit completely hidden from the C program). But you
cannot use those hidden bits in other types, either – IOW, if you’ve
hidden that bit from a char, you cannot use it in an int.

They also require that the C types are binary, despite there
(historically) having been numerous decimal machines. This would make
an implementation of C on a decimal machine quite painful, although
not impossible - you could, for example, map three 8 bit C chars onto an
eight decimal digit word (aka hardware byte), and ignore the extra
range. The packing and unpacking of those values would be painful, to
say the least.

Again, the exact size (or representation) of hardware bytes is *not*
specified by the C standard, which defines only C bytes. The
implementation must establish some mapping between the two. Obviously
we'd prefer such a mapping is easy and efficient (as presumably would
the folks on the C committee), so it's not surprising that the
requirements for C map fairly well onto common hardware (and that,
of course, works in both directions).
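If it helps, here's a toy model of the second mapping described above
(purely hypothetical names and values, not how any real implementation
looks at the source level): each element of hw[] stands in for one 9-bit
hardware byte, and the C program is only ever shown the low 8 bits.

#include <stdio.h>

#define C_BYTE_BITS  8
#define C_BYTE_MASK  ((1u << C_BYTE_BITS) - 1u)

/* Four pretend 9-bit hardware bytes (values 0..0x1FF). */
static const unsigned hw[4] = { 0x1FF, 0x041, 0x100, 0x07F };

/* Hypothetical "what the implementation exposes as a C byte". */
static unsigned fetch_c_byte(size_t i)
{
    return hw[i] & C_BYTE_MASK;   /* the ninth bit never reaches the C program */
}

int main(void)
{
    size_t i;
    for (i = 0; i < 4; i++)
        printf("hardware byte %u = 0x%03X, visible C byte = 0x%02X\n",
               (unsigned)i, hw[i], fetch_c_byte(i));
    return 0;
}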


> > > > (...)
> > >> A word addressed machine
> > >> with 32 bit words (or hardware bytes), would need to generate code to
> > >> pack and unpack four C chars (again assuming we wanted the
> > >> implementation to have 8 bit C chars), from a single word as needed.
>
> > >Not necessarily.  It could implement chars as 32 bit words or some other
> > >combination larger than 8-bits.
>
> > What part of "assuming we wanted the implementation to have 8 bit C
> > chars" did you miss in the above?
>
> Nothing AFAICT.  You can use a single 32-bit word to implement a single
> 8-bit char if you choose...  It might be wasteful of space but quickest or
> easiest to implement.  In which case, there is no need to pack and unpack
> four C chars, which clearly explains the "Not necessarily."


Unused bytes *between* objects (whether for alignment or some other whim
of the compiler) are a different (and mostly irrelevant) issue.  You
can, absolutely, create an implementation where only eight bits of
each (let's say 32 bit for the sake of discussion) word are used.
That would lead to an array of chars being a sequence of (32 bit)
words. You cannot, however, then put an int (or long) into a single
32 bit word. That would break the ability to copy the array of bytes
that make up the int into an array of chars, and leave the values
intact (see the extended quote from footnote 40 above, for example),
and would also make the result of sizeof illogical.


> > > > All of which is irrelevant, except to implementation.
>
> > > So, why'd you bring it up?
>
> > Because you did, by appearing to conflate hardware bytes and C bytes.
>
> !?!?!...  (Interesting, Phil likes to use "conflate" too...)


Good for Phil. "Conflate" is an excellent word, of the best quality.
It should be used more often.


> > > > If you wanted to implement a system with 16 bit C bytes (and thus 16
> > > > bit C chars), on a 8-bit-byte addressed machine, the compiler will
> > > > have to generate code so that all char accesses address a pair of 8-
> > > > bit hardware bytes. And the smallest addressable unit in the C
> > > > program will be that 16 bit C char.
>
> > > > Nor is your assertion that hardware with a 9-bit hardware byte
> > > > requires a 9-bit C byte and char true.
>
> > > Nowhere did I say that... Reread.
>
> > "If the smallest native addressable unit is 9-bits, that's a C byte"
> > appears to refer to hardware bytes, both in isolation and in context.
>
> True.
>
> > If that's not what you meant, then my comment was superfluous.
>
> That's exactly what was meant.  Nowhere did I say this was "required".
> Nowhere did I "assert".  These are extra attributes you applied to the
> example in the discussion.  Reread.
>
> > > > While that might well make for
> > > > a convenient implementation on the machine, there is no reason that
> > > > the implementation might not expose 8 bit C bytes and chars, and
> > > > synthesize those out of the underlying 9-bit hardware bytes.
>
> > > True. Haven't we been over this? Either this time or last time? May of
> > > last year...
>
> > Yes.  And you basically refused to acknowledge that the C standard is
> > not described in terms of real hardware,
>
> Did I?  (From the same para even that your FWIW came from...)
>
> FWIW: RP: The "minimum model" requirements for C aren't part of the
> definition of a "virtual machine," or of C's "abstract machine," or even
> included in the C standards...
>
> > Yes.  And you basically refused to acknowledge that the C standard is
> > not described in terms of real hardware,
>
> Wrong.  I said it's impossible to entirely abstract C from real hardware.
> I've also said (maybe not in that thread...) that it's impossible to
> understand C completely without understanding how it fits onto real
> hardware.  You said C was implemented on some "virtual machine"...   What a
> crock!  A complete bastardization of 5.1.2.3 that you and Phil used as a
> justification.  And the "abstract machine" in 5.1.2.3 doesn't refer to an
> "abstract machine" in the normal sense or a "virtual machine."  Both of
> these are execution or interpretation environments implemented on real
> hardware, usually in software.  An "abstract machine" in 5.1.2.3 refers to
> an imaginary unimplementable context that ensures proper C program
> execution.
>
> >  and went away in a huff...
>
> > FWIW: RP: "This, of course, is due to his continued belief in the pure
> > abstraction of C from the underlying hardware and assembly: a
> > fallacy."
>
> I "went away in a huff..."?  Where do you get that from?  You bailed out.
> Phil insulted and bailed out.  My last post is after you two.


When Phil wondered why you didn't respond to my post, you said "No. I
decided it wasn't in my best interest to pursue the conversation with
RW." If you've decided not to pursue the conversation with me, do you
expect me to continue talking to myself?

But I have to simply disagree about the need to consider history when
reading the C standard. It may help you to understand why certain
things are the way they are, may illustrate which of several choices
an implementation might make is the "better" one, and may well
help you understand the standard. The whole point of the C standard
is to provide the complete* definition, explicitly so that knowledge
of history is not required. They may not succeed 100%, but they come
pretty darn close.

*Complete within the context that the standard was written. It does
not, for example, define the term "binary," or much other industry
jargon.


> > Any given implementation of course relates the two, but the C standard
> > itself does not.
>
> If the C spec. does not relate the two, then the C spec. itself is
> unimplementable.  No version of C can comply with the spec without this
> relationship being defined.  What is unimplementable is worthless.  There is
> no exception to this fact.  You need to learn to read between the lines of
> the C spec or add in historical context.


There you go, conflating again... ;-)

The implementation creates the mapping.  It defines the relationship.
That's its job.  Consider the IEEE math standard.  It also does not
talk about any physical implementation, at best it talks about
patterns of bits, and what various operations do to those patterns of
bits. Sort of like the C standard. In both cases I can write a
program that does something well defined, with no reference to any
particular implementation. If I actually hope to run my program, I'm
going to have to find an implementation.

In both cases, various attributes of a particular implementation might
well be visible to me, and I might well make use of such information.
For example, the exact sizes of various types - say, the
number of bits in an IEEE extended double, or the order in which those
bits are stored, or with what padding. Those details might well be
important to a particular program, but assuming I write a program that
depends on a 128 bit extended double (as opposed to the minimum 80
bit), I clearly restrict its portability (even more so than just
assuming that a minimum extended double exists at all, since it's an
optional type).


> > Also obviously hardware with odd parameters may make
> > C difficult to implement in various ways, and also clearly one reason
> > that C is broadly popular is that most hardware does *not* produce
> > significant difficulties for a C implementer.
>
> Yes, exactly as I described in May 07:
>
> FWIW: RP: I hinted at the truth by referring to statements by Alex Stepanov
> in 1995, the primary creator of the C++ STL.  He stated that Dennis Ritchie
> designed C around a minimum model of computers which were well designed to
> solve numerical problems: byte addressable memory, flat address spaces, and
> pointers.  He claimed that this minimum model, developed over many decades
> using real computers, is the reason C is a success.
>
> > The reverse is true as
> > well, it's hard to imagine a modern hardware designer not taking ease of
> > C implementation into account when designing an architecture.


Which is all true, but all irrelevant. C has certainly, and usefully,
been implemented on machines which do not meet that minimal model.
Ask anyone who's ever compiled a large model C program on 16 bit x86
(where pointers and the address space are most assuredly not flat).
Or someone who's used C on a non-byte addressable microcontroller
where the implementers decided to implement large chars (instead of
synthesizing them out of the actual words).


> Does the x64 instruction set support 8-bit bytes?


Assuming you mean x86-64, yes.


> > The bottom line is this:  The C standard uses the terms byte and char
> > essentially synonymously,
>
> That may be true.  And, I think that the fact that most C's used to derive
> C89 had 8-bit bytes and 8-bit chars is likely the reason, which I stated
> previously.  But, some part of the spec. must accurately describe things.
> If there's a discrepancy, then the issue must be resolved by the more
> "authoratative" section.  In this case, 3.6 and 3.7.1, the sections which
> actually define the terms in question, should be considered "authoratative,"
> IMO.  Don't you agree?
>
> > and further [a byte and char] must appear to be the same size
> > from a C program's perspective.
>
> False.  A byte is not an accessible unit from C's perspective.  A char is
> accessible.  If bytes are 9-bits, and chars are 8-bits, there is no way for
> me to access the 9th bit of a byte from C.  You can only access the lower 8
> bits of the byte, which are the 8-bit char in this case.
>
> > A byte is the unit of storage, the
> > type is a char.
>
> True.
>
> > A char must fit in a byte
>
> True.
>
> > and a byte must fit in a
> > char
>
> False.


Again you insist on conflating the C notion of a byte and the hardware
concept.

The C standard does not talk about the implementation, it defines what
a C program (and thus a C compiler, aka implementation) needs to
appear to do. It makes those definitions in terms of an abstract
machine, within which it defines various local terms like "byte". The
implementation maps that onto real (for some definition of real)
hardware. You appear to have some fundamental objection to that state
of affairs, which I am failing to understand, but that is the way it
is.
From: robertwessel2 on
On Dec 6, 6:17 am, Phil Carmody <thefatphil_demun...(a)yahoo.co.uk>
wrote:
> "robertwess...(a)yahoo.com" <robertwess...(a)yahoo.com> writes:
> > And I want to mention that I quote from the C99 standard more often
> > only because I have that in electronic form, and only hardcopies of
> > C89, which makes for less typing...
>
> If you ask on c.l.c, someone will furnish you with pointers to
> older versions, I'm sure. I used to have copies of various things
> pre-C99, but don't any more, alas.


I have the common copy of the C89 draft, but I hesitate to quote from
it because the section numbering is quite different. And there are a
few (minor) substantive changes, too.
From: Rod Pemberton on
<robertwessel2(a)yahoo.com> wrote in message
news:b8426bcf-7cca-474c-85f3-67daab581e07(a)x38g2000yqj.googlegroups.com...
> On Dec 7, 12:20 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
> > <robertwess...(a)yahoo.com> wrote in message
> >
> > news:8d725647-529b-40ae-a97e-ad6d296e10c4(a)33g2000yqm.googlegroups.com...
> > On Dec 6, 3:36 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
> >
> > > <robertwess...(a)yahoo.com> wrote in message
> >
> > > > > "Values stored in non-bit-field objects of any other object type
> > > > > consist of n * CHAR_BIT bits, where n is the size of an object of
> that
> > > > > type, in bytes. The value may be copied into an object of type
> > > > > unsigned char [n] (e.g., by memcpy); the resulting set of bytes is
> > > > > called the object representation of the value."
[...]
> >
> > Where? Unless you misquoted, it says:
> > 1) the object consists of n*CHAR_BITS
> > 2) n is the number of bytes needed to build that object
> > 3) the resulting set of bytes is the objects representation
> > 4) an object comprised of some char's can be copied into a bunch of
bytes
> >
>
> Ugh. Your first #4 is plainly wrong.

No.

> The standard says the N bytes
> of an object can be copied into an array N unsigned chars.

It might.  But, it's definitely not part of what you quoted above...

> "Values stored in non-bit-field objects of any other object type
> consist of n * CHAR_BIT bits, where n is the size of an object of that
> type, in bytes."
>
> - an object is stored in N bytes,

#2)

> each with CHAR_BIT bits

No.  First, "object" definitely isn't CHAR_BITS in size.  Second, it doesn't
say "bytes" is CHAR_BITS in size either. "Object" and "bytes" are the
only two nouns in the first part of your phrase, one of which must represent
"each." I took it to be "bytes" due to context and proximity. What that
does say is the values, of certain types of objects, are n*CHAR_BITS bits
with n being obtained from the object's size in bytes. I.e., if
sizeof(long)==4, then the values occupy total bits of 4*CHAR_BITS. While
the number of bytes needed, N, is mentioned, there is no mention of bytes
being a certain number of bits, AFAICT. (BTW, do you diagram sentences as
you read them?)

> "The value may be copied into an object of type unsigned char [n]
> (e.g., by memcpy);"
>
> - Those N bytes,

No. It doesn't say "N bytes". It says, "The value... may be copied."

> or at least the values contained therein,

Yes.

> can be
> copied in to an array of N chars.

No. It doesn't say into an array of "N chars" anywhere. It says "The
value... may be copied" into "an object" which is "of type unsigned
char[n]". An object of type unsigned char[n] is comprised of N bytes, see
#2.

> "the resulting set of bytes is called the object representation of the
> value."
>
> - that array of chars is also a bunch of bytes

No.

An "array" (in quotes since C doesn't actually have arrays, only array
declarations...), uh, never mind... A sequence of C chars has values which
are storable in a sequence of C bytes. But, a sequence of C bytes has
values which aren't necessarily storable in a sequence of C chars.

> And note that memcpy() is defined to move *chars*.
....

> So memcpy, which
> moves chars, also happens to move the same number of bytes (by the
> above).

No. It only has to move the char's values, not bytes. I.e., if both C and
hardware bytes are 9-bits and C char's are 8-bits, then the C char's values
are 8-bits. It only has to copy the 8-bit values into a new set of 9-bit C
or hardware bytes.  I.e., it could clear the 9th bit, OR in the 8-bit values and
leave whatever garbage is in the 9th bit, or set the 9th bit, because the 9th
bit can't be accessed via C.

[AFAICT, snip unrelated]
> It's completely unclear how you think 3.6 and 3.7.1 contradict each
> other.

What? That response would've made sense further below in your statements,
but not here...

> "A byte contains CHAR_BIT bits, and the values of type unsigned char
> range from 0 to 2**CHAR_BIT - 1." So a byte contains CHAR_BIT bits.
> And the numbers that you can put in an unsigned char exactly
> correspond to that. It's not "unsigned char ranges from zero to no
> more than (2**CHAR_BIT - 1)" - rather the range is exact. So a byte
> contains exactly the number of bits that can fit in an unsigned char.
> And an unsigned char can hold exactly the number of different values
> that can fit in a byte.

"(Adapted from the American National Dictionary for Information Processing
Systems.)"

It probably should read:
A) "An unsigned char contains CHAR_BIT bits, and the values of type
unsigned char range from 0 to 2**CHAR_BIT - 1."
B) "If a byte contains CHAR_BIT bits, then the values of type unsigned
char range from 0 to 2**CHAR_BIT - 1."

I think they used "unsigned char" as a synonym for a byte. It's a typo.

> > > Most real implementations will, of course, attempt a mapping
> > > between the two that is simple and efficient (eg. a C char/byte is
> > > implemented as a conventional 8-bit hardware byte), unless there is
> > > some really compelling reason to do otherwise.
> >
> > A char's value is accessible in C, but a char is not addressable
according
> > to the spec. you quoted. A byte is addressable according to the spec.
you
> > quoted, but a byte's value may not be entirely accessible in C. Only the
> > part of byte which overlaps with a char is accessible. The byte
represents
> > the hardware addessability issue which has to be solved by a real
> > implementation to implement objects in C comprised of C char's as
> contiguous
> > sequences of bytes on hardware. Does that make more sense?
>
> That's true of a hardware byte. Not a C byte.

It's true of a C byte too.

> The implementation
> creates some mapping between C bytes and hardware bytes, not
> necessarily 1:1 (an implementation on a 9 bit machine could map nine 8
> bit C bytes into eight 9-bit hardware bytes), or even using the
> entirety of the hardware bytes (an implementation on a 9 bit machine
> might ignore one bit of each hardware byte, thus making the mapping of
> eight bit C bytes onto the hardware bytes appear to be 1:1, while
> leaving that ninth bit completely hidden from the C program). But you
> cannot use those hidden bit in other types, either - IOW, if you've
> hidden that bit from a char, you cannot use it in an int.

Let's say the C byte is 9-bits because the hardware byte is 9-bits. But,
the C char is 8-bits. How do you access the ninth bit of the C byte in C?
(You can't.) The byte is the addressable unit which can address 9-bits both
in C and on hardware. But, the char is the "value unit" which can only
access 8 of those 9-bits. I.e., your "byte must fit in a char" doesn't work
in this legal example. This is entirely *independent* of whether you think
I'm "conflating" C bytes and hardware bytes.

> Unused bytes *between* objects (whether for alignment, some other whim
> of the compiler), are a different (and mostly irrelevant) issue.

Not irrelevant. It directly affects your understanding of 3.6 and 3.7.1.

> You
> can, absolutely, create an implementation where only eight bits of
> each (let's say 32 bit for the sake of discussion) word are used.
> That would lead to an array of chars being a sequence of (32 bit)
> words. You cannot, however, then put an int (or long) into a single
> 32 bit word.

You could, but not spec. compliantly. Every object being representable as a
sequence of char's would be broken and you'd have to ensure memcpy() copied
32-bits behind the scenes instead of C chars. You'd probably want to limit
int, long etc. to 32-bits or whatever size the modified memcpy() was using.
Int's and long's would be only accessible as int's and long's, not via
char's. The offset operator should still work allowing "arrays"...

> That would break the ability to copy the array of bytes
> that make up the int into an array of chars, and leave the values
> intact (see the extended quote from footnote 40 above, for example),

True.

> and would also make the result of sizeof illogical.

No. If sizeof(long)==4 and long is 32-bits, then given a long "broken up"
over 4 8-bit char's which consume 32-bits each, sizeof(long)==4 is still
valid. I.e., it's still four char's. The value returned by sizeof would
only be illogical if you don't "break up" long's and int's into char's as
required.

> Good for Phil. "Conflate" is an excellent word, of the best quality.
> It should be used more often.

It's also used much by one individual on comp.lang.c.

> When Phil wondered why you didn't respond to my post, you said "No. I
> decided it wasn't in my best interest to pursue the conversation with
> RW."

I'm still not sure it's in my best interest... ;)

I recall getting into a number of long drawn out conversations on C in
threads and NG's unrelated to C and was really tired of that. I don't
immediately recall if one of those was with you. I do know I've a few with
Phil, and a few with others who frequent comp.lang.c. They tend to harass
those even off c.l.c. with their frequently incorrect "understanding" of C,
IMO. The usual c.l.c. response:

100+ individuals claim you are wrong without being able to prove it
50+ individuals insult you without remorse
10 individuals claim you are wrong but use faulty logic
1 individual attempts to prove you are wrong but can't do so
1 individual makes an almost correct proof, by ignoring a fact or two

> If you've decided not to pursue the conversation with me, do you
> expect me to continue talking to myself?

Continue "writing" to yourself...? Are you currently talking to yourself?
If so, then I expect you'll likely continue... ;)

You seemed to have waited over a year to resume this conversational topic
with me on C in an assembly NG. Coincidence?

> But I have to simply disagree about the need to consider history when
> reading the C standard.

Really? BTW, which standard?... There are at least four, IMO. And, you
can't decide which *one* of them to read without "need[ing] to consider
history." There are sufficiently large differences between them.

> It may help you to understand why certain
> things are the way they are, may illustrate which of several choices
> an implementation may make might be the "better" one, and may well
> help you understand the standard. The whole point of the C standard
> is to provide the complete* definition,

That's the problem: one can't provide a complete definition for C unless the
hardware is 100% identical on every platform. One can only provide a
somewhat complete definition if the language uses the bare minimum de-facto
features of the computing hardware: basic arithmetic, addresses, integers,
contiguous memory, byte-sized memory, etc. I.e., these underlying
characteristics are implicitly standardized by the C standard. It doesn't
matter if the C spec. doesn't mention them explicitly. They are a
requirement to implementing C.

> explicitly so that knowledge
> of history is not required. They may not succeed 100%, but they come
> pretty darn close.

How do setjmp and longjmp fit? How do arg's passed from the environment
fit? How do you implement realloc without access to underlying OSes memory
allocator? How do you implement the C library if you don't have a *nix
concept of files and equivalents to unistd.h like functions:
open,close,read,write,lseek?

They succeeded in abstracting that which was abstractable: grammar and syntax,
arithmetic, types, and whatever was portable due to de-facto hardware
standardization.
If C didn't have 15+ years of usage prior to C89 which proved some
"portability" or "adaptability" of the language, do you think C would have
ever been standardized? Of course not, it'd have died out.

> *Complete within the context that the standard was written. It does
> not, for example, define the term "binary," or much other industry
> jargon.

Nor, does it properly define a byte or an abstract machine or many other
standardized terms ... etc.

> > > Any given implementation of course relates the two, but the C standard
> > > itself does not.
> >
> > If the C spec. does not relate the two, then the C spec. itself is
> > unimplementable. No version of C can comply with the spec without this
> > relationship being defined. What is unimplementable is worthless. There
is
> > no exception to this fact. You need to learn to read between the lines
of
> > the C spec or add in historical context.
>
> There you go, conflating again... ;-)

No. I believe I stated the truth accurately.

> The implementation creates the mapping. It defines the relationship
> That's it's job. Consider the IEEE math standard. It also does not
> talk about any physical implementation, at best it talks about
> patterns of bits, and what various operations do to those patterns of
> bits. Sort of like the C standard. In both cases I can write a
> program that does something well defined, with no reference to any
> particular implementation.

The last sentence is true but only to a limited degree. It is nowhere near
true in its entirety.  You can't expect to use argv/argc parameters to main
portably, realloc portably, setjmp/longjmp portably, files portably,
structures accessed as "arrays" portably, escape characters portably, int
portably, getenv portably, signals portably, offsetof portably, errno
portably, exit portably, etc.

In the case of C, C is constrained by that which is common among differing
computer architectures. Yet, when I tell you that by providing specific
quotes of Alex Stepanov or perhaps if I said "C captures the essence of RISC
but not CISC," you blatantly declare it to be true but irrelevant.  E.g.,
RW said: "Which is all true, but all irrelevant." So, your point here must
be irrelevant too, from your perspective. I.e., your current point CANNOT
be RELEVANT *AND* at the same time BE IRRELEVANT for my other point when
they're both based on platform commonality.  That's irrational,
illogical, and contradictory.

> > False. A byte is not an accessible unit from C's perspective. A char is
> > accessible. If bytes are 9-bits, and chars are 8-bits, there is no way
for
> > me to access the 9th bit of a byte from C. You can only access the lower
8
> > bits of the byte, which are the 8-bit char in this case.
> >
> > > A byte is the unit of storage, the
> > > type is a char.
> >
> > True.
> >
> > > A char must fit in a byte
> >
> > True.
> >
> > > and a byte must fit in a
> > > char
> >
> > False.
>
>
> Again you insist on conflating the C notion of a byte and the hardware
> concept.

No.

This is what 3.6 and 3.7.1 say:

1) byte is the (addressable) unit of storage - "byte: addressable unit of
data storage"
2) char must fit in a byte - "character ... bit representation that fits
in a byte"
3) byte can't be smaller than a char, i.e., it must be equal or larger in
size - "byte: ... large enough to hold"

"3.6: byte: addressable unit of data storage large enough to hold any
member of the basic character set of the execution environment."

"3.7.1: character - single-byte character <C> bit representation that
fits in a byte."

> The C standard does not talk about the implementation, it defines what
> a C program (and thus a C compiler, aka implementation) needs to
> appear to do. It makes those definitions in terms of an abstract
> machine, within which it defines various local terms like "byte". The
> implementation maps that onto real (for some definition of real)
> hardware. You appear to have some fundamental objection to that state
> of affairs,

I think you didn't fully understand what you read...


Rod Pemberton



From: Glen Herrmannsfeldt on
robertwessel2(a)yahoo.com wrote:
(snip)

> Nor is your assertion that hardware with a 9-bit hardware byte
> requires a 9-bit C byte and char true. While that might well make for
> a convenient implementation on the machine, there is no reason that
> the implementation might not expose 8 bit C bytes and chars, and
> synthesize those out of the underlying 9-bit hardware bytes. It
> could, for example, store 9 C chars in 8 hardware bytes, or store one
> C char per hardware byte, and ignore one bit of each hardware byte.
> In fact, you might be tempted to do such a thing if you wanted to port
> much existing C code to your 9-bit-byte machine, simply because so
> much code will break if CHAR_BIT is not 8.

Well, sizeof(int) must be an integer, so on a machine with
9 bit hardware bytes and, for example, a 36 bit int, the C byte
could not be 8 bits.

More specifically, the PDP-10 is a 36 bit word addressed machine
which has the ability to load/store "bytes" smaller than 36 bits.
The possible CHAR_BIT values for such machines are 9, 12, 18, and 36.
Since it does have operations on 18 bit halfwords, it is likely
that short would be 18 bits, leaving 9 or 18 bits for a C char.

> It's also arguable that a hosted implementation cannot have sizeof
> (char) == sizeof(int), because of assumptions in the library (notably
> you cease being able to assign a unique value to EOF that cannot be
> returned by the character I/O functions). A freestanding
> implementation (common, of course, on DSPs), doesn't have those
> issues.

I believe it is done on word addressed machines.

In most cases, it is important that the EOF value not
be the value of any character in the character set.  Machines
with 32 bit char most likely don't actually use all
the values. (UTF-32 is pretty rare.)
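That's exactly why the usual input idiom stores getchar()'s result in an
int before comparing against EOF; a minimal, standard-C example:

#include <stdio.h>

int main(void)
{
    int c;                      /* int, not char: EOF must be distinguishable
                                   from every unsigned-char value returned */
    unsigned long count = 0;

    while ((c = getchar()) != EOF)
        count++;

    printf("%lu characters read\n", count);
    return 0;
}

If sizeof(int) were 1 on a hosted implementation, that distinction could
collapse, which is the concern raised above.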

-- glen

From: robertwessel2 on
On Dec 9, 2:00 pm, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
> <robertwess...(a)yahoo.com> wrote in message
>
> news:b8426bcf-7cca-474c-85f3-67daab581e07(a)x38g2000yqj.googlegroups.com...
>
> > On Dec 7, 12:20 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
> > > <robertwess...(a)yahoo.com> wrote in message
>
> > >news:8d725647-529b-40ae-a97e-ad6d296e10c4(a)33g2000yqm.googlegroups.com....
> > > On Dec 6, 3:36 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
>
> > > > <robertwess...(a)yahoo.com> wrote in message
>
> > > > > > "Values stored in non-bit-field objects of any other object type
> > > > > > consist of n * CHAR_BIT bits, where n is the size of an object of
> > that
> > > > > > type, in bytes. The value may be copied into an object of type
> > > > > > unsigned char [n] (e.g., by memcpy); the resulting set of bytes is
> > > > > > called the object representation of the value."
> [...]
>
> > > Where? Unless you misquoted, it says:
> > > 1) the object consists of n*CHAR_BITS
> > > 2) n is the number of bytes needed to build that object
> > > 3) the resulting set of bytes is the objects representation
> > > 4) an object comprised of some char's can be copied into a bunch of
> bytes
>
> > Ugh.  Your first  #4 is plainly wrong.
>
> No.
>
> > The standard says the N bytes
> > of an object can be copied into an array N unsigned chars.
>
> It might.  But, it's definately not part of what you quoted above...
>
> > "Values stored in non-bit-field objects of any other object type
> > consist of n * CHAR_BIT bits, where n is the size of an object of that
> > type, in bytes."
>
> >  - an object is stored in N bytes,
>
> #2)
>
> > each with CHAR_BIT bits
>
> No.  First, "object" definately isn't CHAR_BITS in size.  Second, it doesn't
> say "bytes" is CHAR_BITS in size either.  "Object" and "bytes" are the
> only two nouns in the first part of your phrase, one of which much represent
> "each."  I took it to be "bytes" due to context and proximity.  What that
> does say is the values, of certain types of objects, are n*CHAR_BITS bits
> with n being obtained from the object's size in bytes.  I.e., if
> sizeof(long)==4, then the values occupy total bits of 4*CHAR_BITS.  While
> the number of bytes needed, N, is mentioned, there is no mention of bytes
> being a certain number of bits, AFAICT.  (BTW, do you diagram sentences as
> you read them?)
>
> > "The value may be copied into an object of type unsigned char [n]
> > (e.g., by memcpy);"
>
> > - Those N bytes,
>
> No.  It doesn't say "N bytes".  It says, "The value... may be copied."
>
> > or at least the values contained therein,
>
> Yes.
>
> > can be
> > copied in to an array of N chars.
>
> No.  It doesn't say into an array of "N chars" anywhere.  It says "The
> value... may be copied" into "an object" which is "of type unsigned
> char[n]".  An object of type unsigned char[n] is comprised of N bytes, see
> #2.


How is "an array of N chars" different from "an object of type
unsigned char[n]"? Ignoring the signedness, which I've explicitly
ignored several times.

So an object has a value consisting of N * CHAR_BIT bits
("Values...consist of n * CHAR_BIT bits"), the same object is N bytes
long ("where n is the size of an object of that type, in bytes"), the
value can be stored in an array of N unsigned chars ("The value may be
copied into an object of type unsigned char [n]"), and further,
CHAR_BIT is defined as specifying the size of a byte (in 5.2.4.2.1 -
"number of bits for smallest object that is not a bit-field (byte)").

I simply cannot see where there's any wiggle room. Unless you want to
declare a second typo in the CHAR_BIT definition, and even then it
requires a very unusual reading of the first three parts to get to
your position.
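Here's the whole chain in a few lines (a sketch only, assuming nothing
beyond a hosted implementation): copy the value into unsigned char [n],
copy it back, and the value survives, with n and CHAR_BIT agreeing with
sizeof throughout.

#include <limits.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    double original = 3.14159, restored = 0.0;
    unsigned char rep[sizeof original];        /* unsigned char [n], n == sizeof(double) */

    memcpy(rep, &original, sizeof original);   /* object -> object representation */
    memcpy(&restored, rep, sizeof restored);   /* object representation -> object */

    printf("n = %zu bytes, n * CHAR_BIT = %zu bits, round trip %s\n",
           sizeof original,
           sizeof original * (size_t)CHAR_BIT,
           memcmp(&original, &restored, sizeof original) == 0 ? "intact" : "broken");
    return 0;
}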


> > "A byte contains CHAR_BIT bits, and the values of type unsigned char
> > range from 0 to 2**CHAR_BIT - 1."  So a byte contains CHAR_BIT bits.
> > And the numbers that you can put in an unsigned char exactly
> > correspond to that.  It's not "unsigned char ranges from zero to no
> > more than (2**CHAR_BIT - 1)" - rather the range is exact.  So a byte
> > contains exactly the number of bits that can fit in an unsigned char.
> > And an unsigned char can hold exactly the number of different values
> > that can fit in a byte.
>
> "(Adapted from the American National Dictionary for Information Processing
> Systems.)"
>
> It probably should read:
>   A) "An unsigned char contains CHAR_BIT bits, and the values of type
> unsigned char range from 0 to 2**CHAR_BIT - 1."
>   B) "If a byte contains CHAR_BIT bits, then the values of type unsigned
> char range from 0 to 2**CHAR_BIT - 1."
>
> I think they used "unsigned char" as a synonym for a byte.  It's a typo..


Seriously? A sentence for which you cannot find a convoluted parsing
that supports your position is a typo?!

And your additional quote is flatly incorrect. The part you added
("Adapted from...)" applies to the preceding sentence.


> Let's say the C byte is 9-bits because the hardware byte is 9-bits.  But,
> the C char is 8-bits.  How do you access the ninth bit of the C byte in C?
> (You can't.)  The byte is the addressable unit which can address 9-bits both
> in C and on hardware.  But, the char is the "value unit" which can only
> access 8 of those 9-bits.  I.e., your "byte must fit in a char" doesn't work
> in this legal example.  This is entirely *independent* of whether you think
> I'm "conflating" C bytes and hardware bytes.


If the ninth bit is completely inaccessible, it's not meaningful from
the perspective of the C program, no?

If the C byte has extra bits in it, how do they affect the C program?
If they do not, then they may as well not exist.


> > When Phil wondered why you didn't respond to my post, you said "No.  I
> > decided it wasn't in my best interest to pursue the conversation with
> > RW."
>
> I'm still not sure it's in my best interest...  ;)
>
> I recall getting into a number of long drawn out conversations on C in
> threads and NG's unrelated to C and was really tired of that.  I don't
> immediately recall if one of those was with you.  I do know I've a few with
> Phil, and a few with others who frequent comp.lang.c.  They tend to harass
> those even off c.l.c. with their frequently incorrect "understanding" of C,
> IMO.  The usual c.l.c. response:
>
>   100+ individuals claim you are wrong without being able to prove it
>   50+ individuals insult you without remorse
>   10 individuals claim you are wrong but use faulty logic
>   1 individual attempts to prove you are wrong but can't do so
>   1 individual makes an almost correct proof, by ignoring a fact or two
>
> > If you've decided not to pursue the conversation with me, do you
> > expect me to continue talking to myself?
>
> Continue "writing" to yourself...?  Are you currently talking to yourself?
> If so, then I expect you'll likely continue...  ;)
>
> You seemed to have waited over a year to resume this conversational topic
> with me on C in an assembly NG.  Coincidence?


I didn't choose to resurrect a long-forgotten conversation with you.
You posted, in a new thread, incorrect information, which happened to
be similar to that of the 5-07 thread. I responded to that. You
brought up the old thread.


> > But I have to simply disagree about the need to consider history when
> > reading the C standard.
>
> Really?  BTW, which standard?...  There are at least four, IMO.  And, you
> can't decide which to *one* of them to read without "need[ing] to consider
> history."  There are sufficiently large differences between them.
>
> > It may help you to understand why certain
> > things are the way they are, may illustrate which of several choices
> > an implementation may make might be the "better" one, and may well
> > help you understand the standard.  The whole point of the C standard
> > is to provide the complete* definition,
>
> That's the problem: one can't provide a complete definition for C unless the
> hardware is 100% identical on every platform.  One can only provide a
> somewhat complete definition if the language uses the bare minimum de-facto
> features of the computing hardware: basic arithmetic, addresses, integers,
> contiguous memory, byte-sized memory, etc.  I.e., these underlying
> characteristics are implicitly standardized by the C standard.  It doesn't
> matter if the C spec. doesn't mention them explicitly.  They are a
> requirement to implementing C.
>
> > explicitly so that knowledge
> > of history is not required.  They may not succeed 100%, but they come
> > pretty darn close.
>
> How do setjmp and longjmp fit?  How do arg's passed from the environment
> fit?  How do you implement realloc without access to underlying OSes memory
> allocator?  How do you implement the C library if you don't have a *nix
> concept of files and equivalents to unistd.h like functions:
> open,close,read,write,lseek?


In what sense are the functions of setjmp and longjmp ambiguous? How
it's implemented is clearly very implementation dependent (in fact
most implementations require a couple of short snippets of assembler,
unlike most of the rest of the standard C library), but it functions
the same in all versions of C (modulo bugs and non-conformance). Same
with realloc - it has to do something specific - it will presumably
invoke an OS service to allocate memory in many environments, on at
least some occasions. So what? And the semantics of C files are what
they are; the implementation clearly needs to figure out how to map
that onto something that's considered valuable in the local
environment.
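For what it's worth, a minimal setjmp/longjmp example behaves identically
on any conforming implementation, however the jump is implemented
underneath (this sketch sticks to the forms of setjmp use the standard
permits):

#include <setjmp.h>
#include <stdio.h>

static jmp_buf env;

static void fail(void)
{
    longjmp(env, 42);                 /* transfer back to the setjmp() call site */
}

int main(void)
{
    if (setjmp(env) == 0) {           /* 0 means we arrived here directly */
        puts("taking the normal path");
        fail();
        puts("never reached");
    } else {
        puts("returned via longjmp");
    }
    return 0;
}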


> They succeeded in abstracting that which was abstractable: grammar syntax,
> arithmetic, types, and or portable due to de-facto hardware standardization.
> If C didn't have 15+ years of usage prior to C89 which proved some
> "portability" or "adaptability" of the language, do you think C would have
> ever been standardized?  Of course not, it'd have died out.
>
> > *Complete within the context that the standard was written.  It does
> > not, for example, define the term "binary," or much other industry
> > jargon.
>
> Nor, does it properly define a byte or an abstract machine or many other
> standardized terms ... etc.


Clearly it does not define them to your satisfaction.


> > The implementation creates the mapping.  It defines the relationship
> > That's it's job.  Consider the IEEE math standard.  It also does not
> > talk about any physical implementation, at best it talks about
> > patterns of bits, and what various operations do to those patterns of
> > bits.  Sort of like the C standard.  In both cases I can write a
> > program that does something well defined, with no reference to any
> > particular implementation.
>
> The last sentence is true but only to a limited degree.  It is nowhere near
> true in it's entirety.  You can't expect to use argv/argc parameters to main
> portably, realloc portably, setjmp/longjmp portably, files portably,
> structures accessed as "arrays" portably, escape characters portably, int
> portably, getenv portably, signals portably, offsetof portably, errno
> portably, exit portably, etc.


The argc/argv parameters are quite portable.  How they get specified
when the program is run is quite implementation specific (as is how
one actually runs a program), including how the specified input is
parsed into those parameters, but there's nothing unportable about
accessing the values passed into main. And most of the other examples
are simply wrong. A couple I've already addressed, and the others
(offsetof, for example) are quite possible to use portably, if you observe
the specified restrictions. Same with ints. a=b+c; has a perfectly
portable meaning so long as b, c, and the sum of the two are within
-32767..+32767. It also has a perfectly defined meaning if the three
values have values between INT_MIN and INT_MAX. Code can, of course,
assume behavior when those limits are exceeded, and thus tie
itself to a particular implementation (or set thereof).
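To illustrate the portable-within-limits point, a small sketch (the
helper name is mine, not anything standard): the range test happens
before the addition, since signed overflow itself is undefined.

#include <limits.h>
#include <stdio.h>

/* Stores b + c in *sum and returns 1 only when the result is representable. */
static int add_checked(int b, int c, int *sum)
{
    if ((c > 0 && b > INT_MAX - c) || (c < 0 && b < INT_MIN - c))
        return 0;                      /* would overflow: don't evaluate b + c */
    *sum = b + c;
    return 1;
}

int main(void)
{
    int a;

    if (add_checked(30000, 2767, &a))  /* 32767: fits even in a 16-bit int */
        printf("a = %d\n", a);
    if (!add_checked(INT_MAX, 1, &a))
        puts("INT_MAX + 1 not evaluated: it would overflow");
    return 0;
}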


> > Again you insist on conflating the C notion of a byte and the hardware
> > concept.
>
> No.


Ah, so you concede my point, since this is obviously a typo, and you
meant "yes"...


> > The C standard does not talk about the implementation, it defines what
> > a C program (and thus a C compiler, aka implementation) needs to
> > appear to do.  It makes those definitions in terms of an abstract
> > machine, within which it defines various local terms like "byte".  The
> > implementation maps that onto real (for some definition of real)
> > hardware.  You appear to have some fundamental objection to that state
> > of affairs,
>
> I think you didn't fully understand what you read...


Someone certainly isn't.

I'm not sure this is worth pursuing. You appear fully convinced of
your position, to the point where you appear to be willing to make
obviously false and illogical statements. Conversely, you accuse me
of doing the same. If you think either position might be changed, I
am willing to continue for a reasonable time.