Is hex an ascii thing? [ASM]


From: robertwessel2 on 12 Apr 2007 06:07

On Apr 12, 4:29 am, "[Jongware]" <sorry(a)no_spam.plz> wrote:
> Does this mean "Unicode is not compatible with programming"? Nah. Suppose --
> I'm not aware of any -- there is a Unicode-compliant compiler, which can
> read and parse Unicode sources. You still would have to use '0' to '9' for
> numbers, and 'n', 'e', and 'w' characters for the command "new", but your
> comments and your variables might contain any character in Greek, Thai, or
> Telugu you want.

That would be Java.

> But here comes the caveat: The practical problem is the set
> of "characters", "whitespace" and "delimiters" in the compiler has to be
> revised. There are a number of different 'official' white space characters
> defined in Unicode, for example, the Chinese set has its own "fixed width"
> space -- are all of these equal to (ASCII) character 32? If so, how about
> the circled numerals in the dingbat section? Are these numbers? (There is no
> circled '0' -- there is a circled '10', which leads to a whole new kind of
> horror in parsing. There are also glyphs for Roman numerals, which just
> _may_ be too much to interpret...)
> And if you think _that_ far ahead, you might as well accept the single
> ideograph 5D2D "zhan" to equal the keyword "new" -- and now you're buggered
> again, as there is at least one other ideograph 5D84 "zhan" with exactly the
> same meaning.

Java keywords, punctuation, operators and digits for literals all come
from the Latin-1 group. But the set of characters allowed in names and
identifiers is pretty broad, although it is well defined (mostly anything
other than the Unicode ranges for special characters).
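
To make that a bit more concrete, here is a minimal sketch using the standard java.lang.Character classification methods, which follow the same identifier rules. The class name UnicodeClasses and the helper describe() are made up for this example; the sample code points are the kinds of characters raised above (a Greek letter, the ideographic space, a circled digit, a Roman numeral glyph, and a plain ASCII digit).

```java
public class UnicodeClasses {
    // Hypothetical helper: summarizes how java.lang.Character classifies one code point.
    static String describe(int cp) {
        return String.format(
                "U+%04X identStart=%b identPart=%b whitespace=%b digit=%b numeric=%d",
                cp,
                Character.isJavaIdentifierStart(cp), // may begin a Java identifier?
                Character.isJavaIdentifierPart(cp),  // may appear later in an identifier?
                Character.isWhitespace(cp),          // whitespace by Java's definition?
                Character.isDigit(cp),               // a decimal digit (category Nd)?
                Character.getNumericValue(cp));      // numeric value, or a negative value if none
    }

    public static void main(String[] args) {
        int[] samples = {
            0x03A0, // GREEK CAPITAL LETTER PI
            0x3000, // IDEOGRAPHIC SPACE
            0x2460, // CIRCLED DIGIT ONE
            0x2169, // ROMAN NUMERAL TEN
            0x0039  // DIGIT NINE
        };
        for (int cp : samples) {
            System.out.println(describe(cp));
        }
    }
}
```

Note that this only shows how the runtime library classifies characters. Numeric literals in Java source are still restricted to the ASCII digits, whatever getNumericValue() says about a glyph.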

From: robertwessel2 on 12 Apr 2007 06:20

On Apr 12, 5:07 am, "robertwess...(a)yahoo.com"
<robertwess...(a)yahoo.com> wrote:
> Java keywords, punctuation, operators and digits for literals all come
> from the Latin-1 group. But the allowed characters for names and
> identifiers is pretty broad, although it is defined (mostly anything
> other than the Unicode ranges for special characters).

To add a bit to that, Java source, as processed by the compiler, is
always Unicode, but source files are often written in ASCII, which
must be converted first (that's done automatically if the file doesn't
start with the usual Unicode byte order mark).

So if you've got a Unicode source file (and text editor) you can write
"double _pi_=3.14;" (where _pi_ is the Unicode glyph for the Greek
letter). If you have an ASCII source file, you can write "double
\u03A0=3.14;" instead (of course that works in a Unicode source file as
well). If you're editing primarily in ASCII, you probably will want
to avoid anything not in Latin-1 for variable names.

(Please ignore the crappy approximation for pi.)
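
For anyone who wants to try it, a minimal sketch of the ASCII-only variant follows. The class name PiEscape and the method circumference() are invented for this example; the only thing taken from the post above is the escaped identifier itself. The file contains nothing but ASCII, yet the field is really named with the single Greek letter.

```java
public class PiEscape {
    // \u03A0 is GREEK CAPITAL LETTER PI, written as an escape so the file stays pure ASCII.
    // The compiler translates the escape before it tokenizes the line.
    static double \u03A0 = 3.14;

    // Hypothetical example method that uses the escaped identifier like any other name.
    static double circumference(double radius) {
        return 2 * \u03A0 * radius;
    }

    public static void main(String[] args) {
        System.out.println(circumference(1.0));
        // The same escape inside a string literal yields a one-character string.
        System.out.println("\u03A0".length());
    }
}
```

If the file is saved in a Unicode encoding instead, the escape can be replaced by the literal glyph, provided the compiler is told (or correctly guesses) the file's encoding, for instance with javac's -encoding option.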

From: rhyde on 12 Apr 2007 15:35

On Apr 11, 7:14 pm, "Evenbit" <nbaker2...(a)charter.net> wrote:
> On Apr 11, 5:28 pm, "Jim Carlock" <anonym...(a)127.0.0.1> wrote:
>
> > "[Jongware]" wrote...
>
> > "Jim Carlock" posted...
> > : News reader: Outlook Express
> > : Greek Capital Letter Gamma...
> > : G
> > :
> > : Accessories\System Tools\Character Map.
> > : Set the Font to Arial.
> > : Scroll down to U+0393 Greek Capital Letter Gamma, select it
> > : by clicking on it, then Click on the Select button.
> > : Click on the Copy button.
> > :
> > : Not sure if this works or not. Will see.
>
> > Well it worked before I pressed the Send button. Seems to require
> > a Greek Newsgroup with HTML encoding to be able to handle the
> > extra characters.
>
> > 0, 1, 2, 3, 4, 5, 6, 7, 8, 9...
> > 10 ? ? ? ? ? ¤ ? ? ? ?
> > 20 ¶ § ? ? ? ? ? ? ? ?
> > 30 ? ? !
>
> I have to say that I am a tad disappointed. Knowing your interest
> level for "all things assembly" I assumed you would write some 'code'
> to help you conduct these tests. ;-)
>
> You give me no other choice but to plop some HLA fodder here for Rene
> (and gang) to feast:
>
> program chartest;
> #include( "stdlib.hhf" )
> // change line #23 below if you use StdLib 2.0
>
> static
> s :string;
> buff :byte[16];
>
> begin chartest;
>
> str.init( buff, 16 );
> mov( eax, s );
>
> stdout.put( " Dec Hex Char" nl );
> stdout.put( "----+----+----" nl );
> mov( 32, cl );
>
> double_ampersand:
>
> stdout.puts( " " );
> stdout.putu8( cl );
> stdout.puts( " " );
> conv.bToStr( cl, 2, ' ', s ); // in StdLib 2.0 use byteToHex()

Actually, that would be conv.h8ToStr :-)
hLater,
Randy Hyde

From: [Jongware] on 12 Apr 2007 17:45

<robertwessel2(a)yahoo.com> wrote in message
news:1176373235.589035.43940(a)n76g2000hsh.googlegroups.com...
> On Apr 12, 5:07 am, "robertwess...(a)yahoo.com"
> <robertwess...(a)yahoo.com> wrote:
> > Java keywords, punctuation, operators and digits for literals all come
> > from the Latin-1 group. But the allowed characters for names and
> > identifiers is pretty broad, although it is defined (mostly anything
> > other than the Unicode ranges for special characters).
>
> To add a bit to that, Java source, as processed by the compiler, is
> always Unicode, but source files are often written in ASCII, which
> must be converted first (that's done automatically if the file doesn't
> start with the usual Unicode byte order mark).
>
> So if you've got a Unicode source file (and text editor) you can write
> "double _pi_=3.14;" (where _pi_ is the Unicode glyph for the Greek
> letter). If you have an ASCII source file, you can write "double
> \u03A0=3.14;" instead (of course that works in a Unicode source file as
> well). If you're editing primarily in ASCII, you probably will want
> to avoid anything not in Latin-1 for variable names.

Excellent description -- it seems they avoided the delimiter/numeral dichotomies
by just ignoring them :-)
Still to come: Number notation using *only* the Roman Numeral glyphs?

[Jw]

From: robertwessel2 on 12 Apr 2007 18:01

On Apr 12, 4:45 pm, "[Jongware]" <IdontWantS...(a)hotmail.com> wrote:
> > So if you've got a Unicode source file (and text editor) you can write
> > "double _pi_=3.14;" (where _pi_ is the Unicode glyph for the Greek
> > letter). If you have an ASCII source file, you can write "double
> > \u03A0=3.14;" instead (of course that works in a Unicode source file as
> > well). If you're editing primarily in ASCII, you probably will want
> > to avoid anything not in Latin-1 for variable names.
>
> Excellent description -- it seems they avoided the delimiter/numeral dichotomies
> by just ignoring them :-)
> Still to come: Number notation using *only* the Roman Numeral glyphs?

"double _pi_= xxii / vii;" - I like it.
