From: robertwessel2 on
On Dec 9, 5:34 pm, Glen Herrmannsfeldt <g...(a)ugcs.caltech.edu> wrote:
> robertwess...(a)yahoo.com wrote:
>
> (snip)
>
> > Nor is your assertion that hardware with a 9-bit hardware byte
> > requires a 9-bit C byte and char true.  While that might well make for
> > a convenient implementation on the machine, there is no reason that
> > the implementation might not expose 8 bit C bytes and chars, and
> > synthesize those out of the underlying 9-bit hardware bytes.  It
> > could, for example, store 9 C chars in 8 hardware bytes, or store one
> > C char per hardware byte, and ignore one bit of each hardware byte.
> > In fact, you might be tempted to do such a thing if you wanted to port
> > much existing C code to your 9-bit-byte machine, simply because so
> > much code will break if CHAR_BIT is not 8.
>
> Well, sizeof(int) must be an integer, so on a machine with
> 9 bit hardware and, for example, a 36 bit int then the C byte
> could not be 8 bits.
>
> More specifically, the PDP-10 is a 36 bit word addressed machine
> which has the ability to load/store "bytes" smaller than 36 bits.
> The possible CHAR_BIT for such machines are 9, 12, 18, and 36.
> Since it does have operations on 18 bit halfwords it is likely
> that short would be 18 bits leaving 9 and 18 bits for C char.


There is nothing preventing an implementation from providing 8-bit C
chars/bytes on such a machine; it would just have to provide the
appropriate mapping functions. One would assume that there would be a
strong desire to use the more natural mapping (IOW, 9-bit chars) for
most implementations, although it's possible one could find
justification for the 8-bit char mapping (perhaps for compatibility
reasons).

But my point was that a 9-bit hardware byte does not *require* a 9-bit
C char/byte, although, of course, that would be the most likely
implementation. Required != common != allowed.


>  > It's also arguable that a hosted implementation cannot have sizeof
>  > (char) == sizeof(int), because of assumptions in the library (notably
>  > you cease being able to assign a unique value to EOF that cannot be
>  > returned by the character I/O functions).  A freestanding
>  > implementation (common, of course, on DSPs), doesn't have those
>  > issues.
>
> I believe it is done on word addressed machines.
>
> In most cases, it is important that the EOF value must not
> be the value of a character in the character set.  Machines
> with 32 bit char most likely don't actually use all
> the values.  (UTF-32 is pretty rare.)


If a char is the same size as an int, and you putc() it to a binary
stream, what do you get back when you read it with getc()? It must,
in fact, be the same value (C99 7.19.2: "A binary stream is an ordered
sequence of characters that can transparently record internal data.
Data read in from a binary stream shall compare equal to the data that
were earlier written out to that stream, under the same
implementation."). There are no limits on what values you can putc
(except that the value has to fit in a char). That doesn't leave room
for EOF if int and char are the same size. Note that even putc (which
returns the character written, or EOF if there's an I/O error) by
itself is broken if a char is the same size as an int.

Most (word addressed) machines where ints and chars are the same size
support a freestanding implementation, not a hosted one, and so
sidestep the problem. A freestanding implementation could provide a
non-conforming library where streams involve 8 bit characters and C
chars are 32 bits. Such a library might define streams in such a way
that char c = 0x12345641; putc(c, fp); would write a single ASCII
capital ‘A’ to the stream, but that is absolutely not the defined
semantics of a C stream. Such an implementation might even be more
useful to someone than one which defined streams to contain 32 bit
chars, but that’s not the issue.
From: H. Peter Anvin on
robertwessel2(a)yahoo.com wrote:
>
> There is nothing preventing an implementation implementing 8-bit C
> chars/bytes on such a machine, it would just have to provide the
> appropriate mapping functions. One would assume that there would be a
> strong desire to use the more natural mappings (IOW, 9 bit chars) for
> most implementations, although it's possible one could find
> justification for the 8-bit char mapping (perhaps for compatibility
> reason).
>
> But my point was that a 9-bit hardware byte does not *require* a 9-bit
> C char/byte, although, of course, that would be the most likely
> implementation. Required != common != allowed.
>

Actually, there are combinations of other requirements that make an
8-bit byte on such a machine at least very difficult to achieve. C does
require that all objects can be copied as an array of bytes, so, for
example, packing four 8-bit bytes into a 36-bit word, but having a
36-bit int, would not be permitted, because of the leftover bits.

-hpa
From: robertwessel2 on
On Dec 10, 1:25 am, "H. Peter Anvin" <h...(a)zytor.com> wrote:
> robertwess...(a)yahoo.com wrote:
>
> > There is nothing preventing an implementation implementing 8-bit C
> > chars/bytes on such a machine, it would just have to provide the
> > appropriate mapping functions.  One would assume that there would be a
> > strong desire to use the more natural mappings (IOW, 9 bit chars) for
> > most implementations, although it's possible one could find
> > justification for the 8-bit char mapping (perhaps for compatibility
> > reason).
>
> > But my point was that a 9-bit hardware byte does not *require* a 9-bit
> > C char/byte, although, of course, that would be the most likely
> > implementation.  Required != common != allowed.
>
> Actually, there are combinations of other requirements that make an
> 8-bit byte on such a machine at least very difficult to achieve.  C does
> require that all objects can be copied as an array of bytes, so, for
> example, packing four 8-bit bytes into a 36-bit word, but having a
> 36-bit int, would not be permitted, because of the leftover bits.


Absolutely correct. If you did such a thing you'd likely end up with
32 bit ints (in fact the likely justification for such a thing, code
portability with more common architectures, would presumably demand
that), and you'd basically be using only 32 bits out of each hardware
word (or packing 9 8-bit characters into a pair of 36 bit words). But
while that's ugly (and undoubtedly slow), it's not really complicated
in any way.

Even higher on the perversity scale, a 36 bit int *is* possible, but
would require 5 (8-bit) bytes, and four pad bits (or more bytes, of
course). Actually 8 bit chars, and 36 bit ints (and possibly 72 bit
long longs), required to be aligned on 9 (!!) byte (8-bit) boundaries,
might actually make a certain amount of sense...

My point was not that this was a good idea or a likely implementation,
but that there's not a particular requirement (other than general
sanity) that the native hardware sizes match in any sort of convenient
way the types as visible from a C program.

An analogous situation would occur if you wanted to implement Java on
a 36 bit machine. Except then you wouldn't even have the option of
going to 9/18/36/72 bit types (the types all being exact sizes in
Java).
From: Rod Pemberton on
<robertwessel2(a)yahoo.com> wrote in message
news:438de604-c6dd-4360-9c32-2d192dc866f1(a)u18g2000pro.googlegroups.com...
> On Dec 9, 2:00 pm, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
> > <robertwess...(a)yahoo.com> wrote in message
> > news:b8426bcf-7cca-474c-85f3-67daab581e07(a)x38g2000yqj.googlegroups.com...
> >
> > > On Dec 7, 12:20 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
> > > > <robertwess...(a)yahoo.com> wrote in message
> > > > news:8d725647-529b-40ae-a97e-ad6d296e10c4(a)33g2000yqm.googlegroups.com...
> > > > On Dec 6, 3:36 am, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm> wrote:
> >
> > > > > <robertwess...(a)yahoo.com> wrote in message
> >
> > > > > > > "Values stored in non-bit-field objects of any other object type
> > > > > > > consist of n * CHAR_BIT bits, where n is the size of an object of that
> > > > > > > type, in bytes. The value may be copied into an object of type
> > > > > > > unsigned char [n] (e.g., by memcpy); the resulting set of bytes is
> > > > > > > called the object representation of the value."
> > [...]
> >
> > > > Where? Unless you misquoted, it says:
> > > > 1) the object consists of n*CHAR_BIT bits
> > > > 2) n is the number of bytes needed to build that object
> > > > 3) the resulting set of bytes is the object's representation
> > > > 4) an object comprised of some chars can be copied into a bunch of bytes
> >
> > > Ugh. Your first #4 is plainly wrong.
> >
> > No.
> >
> > > The standard says the N bytes
> > > of an object can be copied into an array N unsigned chars.
> >
> > It might. But, it's definitely not part of what you quoted above...
> >
> > > "Values stored in non-bit-field objects of any other object type
> > > consist of n * CHAR_BIT bits, where n is the size of an object of that
> > > type, in bytes."
> >
> > > - an object is stored in N bytes,
> >
> > #2)
> >
> > > each with CHAR_BIT bits
> >
> > No. First, "object" definitely isn't CHAR_BIT in size. Second, it doesn't
> > say "bytes" is CHAR_BIT in size either. "Object" and "bytes" are the
> > only two nouns in the first part of your phrase, one of which must represent
> > "each." I took it to be "bytes" due to context and proximity. What that
> > does say is that the values, of certain types of objects, are n*CHAR_BIT bits
> > with n being obtained from the object's size in bytes. I.e., if
> > sizeof(long)==4, then the values occupy a total of 4*CHAR_BIT bits. While
> > the number of bytes needed, N, is mentioned, there is no mention of bytes
> > being a certain number of bits, AFAICT. (BTW, do you diagram sentences as
> > you read them?)
> >
> > > "The value may be copied into an object of type unsigned char [n]
> > > (e.g., by memcpy);"
> >
> > > - Those N bytes,
> >
> > No. It doesn't say "N bytes". It says, "The value... may be copied."
> >
> > > or at least the values contained therein,
> >
> > Yes.
> >
> > > can be
> > > copied in to an array of N chars.
> >
> > No. It doesn't say into an array of "N chars" anywhere. It says "The
> > value... may be copied" into "an object" which is "of type unsigned
> > char[n]". An object of type unsigned char[n] is comprised of N bytes, see
> > #2.
>
> How is "an array of N chars" different from "an object of type
> unsigned char[n]"?

"an array of N chars" is comprised of N chars. "an object of type unsigned
char[n]" is comprised of N bytes. In the given context, they are not
synonymous as you treat them. The former is a set of values, which happen
to be chars. The latter is a set of addressable units, which happen to be
bytes. I.e., I could use your favorite word for mixing, merging, confusing
the two, but I'd rather not.

> Ignoring the signedness, which I've explicitly
> ignored several times.
>
> So an object has a value consisting of N*CHAR_BIT bits

No. That object's value is *stored* in N*CHAR_BIT bits. The object's
value does not *comprise* N*CHAR_BIT bits, except when sizeof(char)==sizeof a
byte.

> ("Values...consist of n * CHAR_BIT bits"), the same object is N bytes
> long ("where n is the size of an object of that type, in bytes"),

Yes.

> the
> value can be stored in an array of N unsigned chars ("The value may be
> copied into an object of type unsigned char [n]"),

No. Disregarding the fact that C doesn't have arrays (since everyone I've
presented this to fails to understand it...), the value can be stored in
an "array" of N bytes, not "N unsigned chars". You keep taking
"unsigned char[n]" to mean "N unsigned chars". You need to understand that
you should be looking at "object of type unsigned char[n]", which means (or
is an abstraction representing) N bytes.

> and further,
> CHAR_BIT is defined as specifying the size of a byte (in 5.2.4.2.1 -
> "number of bits for smallest object that is not a bit-field (byte)").

True.

> I simply cannot see where there's any wiggle room.

I understand that...

> Unless you want to
> declare a second typo in the CHAR_BIT definition,

No. There's no error there.

> and even then it
> requires a very unusual reading of the first three parts to get to
> your position.

I believe my understanding to be correct. I've shown where I believe your
understanding is incorrect.

> > > "A byte contains CHAR_BIT bits, and the values of type unsigned char
> > > range from 0 to 2**CHAR_BIT - 1." So a byte contains CHAR_BIT bits.
> > > And the numbers that you can put in an unsigned char exactly
> > > correspond to that. It's not "unsigned char ranges from zero to no
> > > more than (2**CHAR_BIT - 1)" - rather the range is exact. So a byte
> > > contains exactly the number of bits that can fit in an unsigned char.
> > > And an unsigned char can hold exactly the number of different values
> > > that can fit in a byte.
> >
> > "(Adapted from the American National Dictionary for Information Processing
> > Systems.)"
> >
> > It probably should read:
> > A) "An unsigned char contains CHAR_BIT bits, and the values of type
> > unsigned char range from 0 to 2**CHAR_BIT - 1."
> > B) "If a byte contains CHAR_BIT bits, then the values of type unsigned
> > char range from 0 to 2**CHAR_BIT - 1."
> >
> > I think they used "unsigned char" as a synonym for a byte. It's a typo.
>
> Seriously? A sentence for which you cannot find a convoluted parsing
> that supports your position is a typo?!
>
> And your additional quote is flatly incorrect.

Really?

> The part you added
> ("Adapted from...)" applies to the preceding sentence.

True. But to say "your additional quote is flatly incorrect" with regard to
the following line, when it is written in the context of the prior line, is
clearly "flatly incorrect." They didn't start another paragraph there.
They continued with their example, building on the prior statements, which
were defined in terms of an "industry standard" definition. If I saw the last
line completely outside the context of the C standard, I'd say it's correct:
I understand a byte to be 8 bits and unsigned. Within the entire context of
the C standard, I'd say it's incorrect; but taking into account the lines
above, one has to say it's correct within just the context of the
example.

> > Let's say the C byte is 9-bits because the hardware byte is 9-bits. But,
> > the C char is 8-bits. How do you access the ninth bit of the C byte in C?
> > (You can't.) The byte is the addressable unit which can address 9-bits both
> > in C and on hardware. But, the char is the "value unit" which can only
> > access 8 of those 9-bits. I.e., your "byte must fit in a char" doesn't work
> > in this legal example. This is entirely *independent* of whether you think
> > I'm "conflating" C bytes and hardware bytes.
>
> If the ninth bit is completely inaccessible, it's not meaningful from
> the perspective of the C program, no?
>
> If the C byte has extra bits in it, how do they affect the C program?
> If they do not, then they may as well not exist.

So? The fact that the bit has no effect doesn't change how a C byte was
defined.

> I didn't choose to resurrect a long forgotten conversation with you.

Ok.

> In what sense are the functions of setjmp and longjmp ambiguous?

Where they are useable depends on the implementation. Since the
implementation can't be standardized due to different hardware, setjmp and
longjmp can't be completely standardized either. What can't be
standardized, can't be abstracted.

> A couple I've already addressed, and the others
> (offsetof, for example) are quite possible to use portably, if you observe
> the specified restrictions.

The C Rationale "suggests" four methods. I.e., offsetof is non-portable.
The closest "anyone" has come to a portable definition is X11's
definition... IIRC (I can't find the code at the moment), it has one basic
macro which works for many systems, but then has a number of other custom
macros.

> > > Again you insist on conflating the C notion of a byte and the hardware
> > > concept.
> >
> > No.
>
> Ah, so you concede my point, since this is obviously a typo, and you
> meant "yes"...

Funny, but still wrong...

> > > The C standard does not talk about the implementation, it defines what
> > > a C program (and thus a C compiler, aka implementation) needs to
> > > appear to do. It makes those definitions in terms of an abstract
> > > machine, within which it defines various local terms like "byte". The
> > > implementation maps that onto real (for some definition of real)
> > > hardware. You appear to have some fundamental objection to that state
> > > of affairs,
> >
> > I think you didn't fully understand what you read...
>
> Someone certainly isn't.

I take this to imply you mean me... I assure you it isn't.

> I'm not sure this is worth pursuing.

What did I say over a year ago?

> You appear fully convinced of
> your position,

No. I'm not convinced of my position. As I see it, I did not take a
position; I only explained the quotes you presented. "Taking a
position" implies I'm being argumentative for argument's sake. I clearly
stated that what I originally presented from memory was incorrect. I only
explained the quotes you presented without adding further information from
the spec (likely to support me). You are overwhelmed with just that. I do
believe my understanding is correct and represents the truth.

> to the point where you appear to be willing to make
> obviously false and illogical statements.

No. I never did any such thing, AFAIK. I believe I did demonstrate that you
did so on more than a few occasions.

> Conversely, you accuse me
> of doing the same.

Accuse? No. Prove? Yes.

> If you think either position might be changed, I
> am willing to continue for a reasonable time.

The fact you weren't able to comprehend or accept the truth when shown to
you may have been the reason I chose to withdraw previously...


Rod Pemberton


From: Glen Herrmannsfeldt on
robertwessel2(a)yahoo.com wrote:
> On Dec 10, 1:25 am, "H. Peter Anvin" <h...(a)zytor.com> wrote:
>>robertwess...(a)yahoo.com wrote:
(snip)

>>>But my point was that a 9-bit hardware byte does not *require* a 9-bit
>>>C char/byte, although, of course, that would be the most likely
>>>implementation. Required != common != allowed.

>>Actually, there are combinations of other requirements that make an
>>8-bit byte on such a machine at least very difficult to achieve. C does
>>require that all objects can be copied as an array of bytes, so, for
>>example, packing four 8-bit bytes into a 36-bit word, but having a
>>36-bit int, would not be permitted, because of the leftover bits.

> Absolutely correct. If you did such a thing you'd likely end up with
> 32 bit ints (in fact the likely justification for such a thing, code
> portability with more common architectures, would presumably demand
> that), and you'd basically be using only 32 bits out of each hardware
> word (or packing 9 8-bit characters into a pair of 36 bit words). But
> while that's ugly (and undoubtedly slow), it's not really complicated
> in any way.

You might get away with a 32 bit int on the PDP-10 (one of the
more popular 36 bit machines in recent years), but it would be
somewhat harder for float.

> Even higher on the perversity scale, a 36 bit int *is* possible, but
> would require 5 (8-bit) bytes, and four pad bits (or more bytes, of
> course). Actually 8 bit chars, and 36 bit ints (and possibly 72 bit
> long longs), required to be aligned on 9 (!!) byte (8-bit) boundaries,
> might actually make a certain amount of sense...

You keep trying to make it hard. Four 9 bit bytes per word works
just fine. There ARE C compilers for the PDP-10, and I believe
that is what they use.

It does get interesting as the text file format for the DEC
operating systems puts five ASCII characters in a 36 bit word.
Text file I/O has to convert between that and the internal format.
Converting seven bit ASCII to either 8 or 9 bit char is about as
easy either way.

> My point was not that this was a good idea or a likely implementation,
> but that there's not a particular requirement (other than general
> sanity) that the native hardware sizes match in any sort of convenient
> way the types as visible from a C program.

Users will expect it, but yes it isn't required. If there is a
large performance penalty for using a certain size, though, it
will be discouraged.

> An analogous situation would occur if you wanted to implement Java on
> a 36 bit machine. Except then you wouldn't even have the option of
> going to 9/18/36/72 bit types (the types all being exact sizes in
> Java).

Java is very different from C. Note, for example, that Java doesn't
have the requirement of being able to copy data as arrays of
unsigned char. Even more, requiring IEEE floating point removes
the requirements based on the native floating point format.
(Not IEEE for the PDP-10).

-- glen
