From: Rick Rothstein on
Assuming that any character with a code above 255 makes the text
non-English, would something like this work? (Note that the character
after the exclamation point is a space.)

If StringValue Like "*[! -" & Chr$(255) & "]*" Then
    ' At least one character lies outside the range space..Chr$(255)
    ' (treated here as "non-English")
Else
    ' All characters are in the range space..Chr$(255)
End If
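
For reuse, the test above could be wrapped in a small function. This is only a sketch: the name HasCharOutsideRange is made up for illustration, and it assumes the module compares in binary mode (the VB default), so that the Like range follows character codes rather than locale sort order. Which character Chr$(255) actually is depends on the system's ANSI code page.

```vb
Option Explicit
Option Compare Binary   ' the default; Like ranges then follow character codes

' Hypothetical helper (the name is not from this thread).
' Returns True when Text contains at least one character outside the
' range space .. Chr$(255), which is what the Like pattern tests for.
Public Function HasCharOutsideRange(ByVal Text As String) As Boolean
    HasCharOutsideRange = Text Like "*[! -" & Chr$(255) & "]*"
End Function
```

Note that the range starts at the space character, so control characters such as vbTab or vbCrLf fall outside it and are reported as well.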

--
Rick (MVP - Excel)


"Jim Mack" <jmack(a)mdxi.nospam.com> wrote in message
news:uYM5FLDrKHA.728(a)TK2MSFTNGP04.phx.gbl...
> Jeff Johnson wrote:
>> "Phil Hunt" <aaa(a)aaa.com> wrote in message
>> news:Oc$RZFCrKHA.1796(a)TK2MSFTNGP02.phx.gbl...
>>
>>> Thanks. I basically have to examine the bit patterns to decide.
>>> I understand ASCII; it is Unicode I have some trouble
>>> with. I know it is 16 bits instead of 8. But in the VB debug window,
>>> I have never been able to see a 16-bit character; maybe it does
>>> not display on the screen. Do you know what I am talking about?
>>> For the character 'A', how can I see the full 16-bit pattern in
>>> VB?
>>
>> I believe you can use the AscW() function to find this. If you get
>> a value back > 255, I'd say you can safely assume it's a
>> non-English character.
>
> Not even close. To prove that, in the Immediate Window:
>
> For Idx = 128 To 160: ? Idx, AscW(Chr(Idx)): Next
>
> --
> Jim Mack
> Twisted tees at http://www.cafepress.com/2050inc
> "We sew confusion"

From: Jim Mack on
Phil Hunt wrote:
> OK, I'll make it > 128

Maybe you missed the point. There are quite a few "English Characters"
for which AscW() will return results > 128, or 255, etc.

The same is true of any ANSI character set under Windows, both SBCS
and MBCS.

--
Jim Mack
Twisted tees at http://www.cafepress.com/2050inc
"We sew confusion"



From: Phil Hunt on
Rick,
I think your code would work. I just have to look up 'Like'; I've never used it,
but it seems handy.

Jim,
I think I missed your point. I used to memorize EBCDIC codes. Since I moved
over to the PC, bit patterns have been a thing of the past for me, until now.

Thanks, I'll take a closer look on Monday.


From: Helmut Meukel on
Phil,

Just run charmap.exe.
It shows the hex code of the selected character.
I looked at the Win2000 and the Vista versions; in the Vista version
you can select more DOS code pages (extended ASCII).
Both show you Unicode and a variety of Windows code pages (ANSI).

AFAIK, when looking at a non-Unicode text file, you have to _know_
what code page was used to create it. IBM and Micro$oft introduced
code pages with DOS 3.3 but forgot to define anything to distinguish
between text coded with different code pages. Same is true for ANSI.

In Unicode text with Western characters the high byte is usually
hex 00. Hex 00FE is the lowercase Icelandic character thorn (þ).
The Dutch ij is usually written as two characters, but in Unicode
you can use the single character hex 0133 (ĳ).
The trademark sign ™ is hex 2122, the per mille sign ‰ is hex 2030, the
peseta sign ₧ is hex 20A7, c/o (℅) is hex 2105, and the Danish/Norwegian
A/S (Aktieselskab, ⅍) is hex 214D.
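
To look at these 16-bit values from inside VB rather than in charmap, something along these lines should do. It is a rough sketch, not anything posted earlier in the thread: CodeUnitHex and DumpCodeUnits are names invented here. AscW returns a signed Integer, so the value is masked to get 0 to 65535 before formatting.

```vb
Option Explicit

' Hypothetical helper: returns the 16-bit code of the first character
' of Ch as four hex digits. Ch must not be empty (AscW raises an error
' on an empty string).
Public Function CodeUnitHex(ByVal Ch As String) As String
    Dim Code As Long
    ' AscW returns a signed Integer; the mask maps it to 0..65535.
    Code = AscW(Ch) And &HFFFF&
    CodeUnitHex = Right$("0000" & Hex$(Code), 4)
End Function

' Hypothetical helper: prints position and hex code of every
' character of Text to the Immediate window.
Public Sub DumpCodeUnits(ByVal Text As String)
    Dim i As Long
    For i = 1 To Len(Text)
        Debug.Print i, CodeUnitHex(Mid$(Text, i, 1))
    Next i
End Sub
```

In the Immediate window, ? CodeUnitHex("A") gives 0041 and ? CodeUnitHex(ChrW$(&H2122)) gives 2122. Printing the hex code instead of the character itself sidesteps the problem that the VB6 IDE generally cannot display characters outside the system code page.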

HTH

Helmut.



"Phil Hunt" <aaa(a)aaa.com> wrote in message
news:Oc$RZFCrKHA.1796(a)TK2MSFTNGP02.phx.gbl...
> Thanks. I basically have to examine the bit patterns to decide.
> I understand ASCII; it is Unicode I have some trouble with. I know it
> is 16 bits instead of 8. But in the VB debug window, I have never been able to
> see a 16-bit character; maybe it does not display on the screen. Do you know
> what I am talking about?
> For the character 'A', how can I see the full 16-bit pattern in VB?
>
> "Helmut Meukel" <NoSpam(a)NoProvider.de> wrote in message
> news:uW6t4oBrKHA.4636(a)TK2MSFTNGP06.phx.gbl...
>>
>> "Phil Hunt" <aaa(a)aaa.com> wrote in message
>> news:OthUFOBrKHA.4220(a)TK2MSFTNGP05.phx.gbl...
>>> OK. Forget French for a moment. How can I tell if the string contains
>>> "East Asian" characters?
>>>
>>>
>>> "Jeff Johnson" <i.get(a)enough.spam> wrote in message
>>> news:eO%236mFBrKHA.5940(a)TK2MSFTNGP02.phx.gbl...
>>>> "Phil Hunt" <aaa(a)aaa.com> wrote in message
>>>> news:unthJABrKHA.1796(a)TK2MSFTNGP02.phx.gbl...
>>>>
>>>>> What is the best way to determine if a string contains "non-English"
>>>>> characters?
>>>>
>>>> That's not an easy question to answer. Consider the word "resumé." It's an
>>>> English word (taken from French) but it contains an accented character that
>>>> is not "native" to English. If your code encountered that word, would you
>>>> want it to judge that it contains a "non-English character"?
>>>>
>>>
>>
>> Let's start with the code table.
>> Characters in strings are just byte or integer values.
>> Old DOS used ASCII (IIRC: American Standard Code for Information
>> Interchange), with 7 data bits + 1 parity bit.
>> IBM created Extended ASCII (8 data bits, no parity bit) and used
>> the doubled capacity to code some European characters and graphic
>> characters (card symbols, lines...).
>> This extended ASCII finally became Code Page 437. Other code
>> pages like 850 (multilingual) and 865 (Scandinavian) used the same
>> code values for different characters. My first Vectra PC used the
>> Roman8 character set, also used by HP's 250, 1000 and 3000
>> systems.
>> With Windows, Microsoft switched to ANSI, still 8 bits, and
>> finally to Unicode (16 bits).
>>
>> So first you have to know how your text is coded to determine
>> which codes are used for East Asian characters.
>>
>> HTH.
>>
>> Helmut.
>
>

From: Jim Mack on
Rick Rothstein wrote:
> Assuming any characters above ASCII 255 in a text string makes the
> text non-English, then does something like this work (note that is
> a space character after the exclamation point)?

You have to distinguish AscW() results from Asc() results. If you use
AscW(), you will see 'English' characters with codes > 255. Using
Asc() you won't, but you may then qualify some non-English characters
as English (which may be OK depending on the circumstance).

I don't know if Like examines the Unicode characters... if so, then it
will act the way AscW() does and fail some valid characters.
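
To see the two functions disagree side by side, a loop like the following could be run. It is a sketch that assumes a Western European system (ANSI code page 1252); other code pages will print different values, and the procedure name is made up for illustration.

```vb
Option Explicit

' Hypothetical demo; run it from the Immediate window by typing:
'   CompareAscAndAscW
' Assumes ANSI code page 1252. Other code pages give other results.
Public Sub CompareAscAndAscW()
    Dim Idx As Long
    Dim Ch As String
    For Idx = 128 To 160
        Ch = Chr$(Idx)      ' ANSI code converted to a Unicode character
        ' Columns: ANSI code, Asc result, AscW result, AscW result in hex.
        ' AscW returns a signed Integer, hence the mask.
        Debug.Print Idx, Asc(Ch), AscW(Ch) And &HFFFF&, _
                    Hex$(AscW(Ch) And &HFFFF&)
    Next Idx
End Sub
```

On a code page 1252 system, Chr$(128) is the euro sign: Asc reports 128 while AscW reports 8364 (hex 20AC). So a "greater than 255" test on AscW would reject a character that is perfectly ordinary in English text.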

--
Jim


