From: Russell Fields on
Dan, Thanks for the comment and the very good example of the problem.

> Even with moving the sorting/comparison to the client application, I think
> Bill might still have issues with round-trip character conversions with
> single database collation and different code pages on the client.

Yes, I was assuming (and that is always a mistake, of course) that the
chosen code page in SQL Server would match the supported code page on the
web, which would then define the supported languages. But my unease is why I
also said, "I find it hard to believe that it will have no impact at all." You
have helped me, and I hope also helped Bill, to understand the problem more
clearly.

RLF


From: Dan Guzman on
> Yes, I was assuming (and that is always a mistake, of course) that the
> chosen code page in SQL Server would match the supported code page on the
> web, which would then define the supported languages.

Let me share a related problem I coincidentally ran into a couple of hours
after my response. We have a classic ASP application that supports
Traditional Chinese. Chinese strings were stored in an nvarchar column, but
the non-parameterized insert/update statements didn't prefix the string
literals with N, and the HTML META tag specified charset BIG5 (a non-Unicode
DBCS character set).

This application has worked for years, but a problem surfaced when some newly
inserted data did not display correctly. When I looked at the nvarchar data in the
database using SSMS, I saw that both old and new data contained mostly
special characters rather than Chinese characters. If I manually updated
the problem row from SSMS using an N literal, the data then looked good in
SSMS but didn't display correctly in the application.
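
To make the behaviour above concrete, here is a rough Python sketch of one plausible mechanism. It is an assumption, not something confirmed for this application: the browser submits Big5 bytes, and without the N prefix each byte is treated as a character of a single-byte code page before being stored as nvarchar. Latin-1 stands in for that code page, and the function names are made up for illustration.

```python
# Sketch only: the charsets and the mis-decoding step are assumptions,
# not details confirmed in the account above.
PAGE_CHARSET = "big5"          # charset named in the page's META tag
ASSUMED_CODE_PAGE = "latin-1"  # stand-in for a single-byte code page


def store_without_n_prefix(text: str) -> str:
    """Simulate Big5 bytes being read as single-byte characters on insert."""
    return text.encode(PAGE_CHARSET).decode(ASSUMED_CODE_PAGE)


def display_in_app(stored: str) -> str:
    """Simulate the application reversing the same conversion on display."""
    raw = stored.encode(ASSUMED_CODE_PAGE, errors="replace")
    return raw.decode(PAGE_CHARSET, errors="replace")


original = "中文"
stored = store_without_n_prefix(original)

print(repr(stored))              # special characters, as a Unicode-aware tool sees them
print(display_in_app(stored))    # the app's round trip recovers the Chinese text
print(display_in_app(original))  # proper Unicode does not survive the app's path
```

Under these assumptions the mis-stored value only looks right after the matching reverse conversion, which would explain why a row corrected with an N literal looks good in SSMS but not in the application.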

The moral of this story is that translating data between codepages/charsets
can occur at many levels and involves compromises. We plan to rewrite the
application in .NET, convert the data to proper Unicode and ensure we use
Unicode end-to-end going forward. IMHO, Unicode (end-to-end) is the best
approach.

--
Dan Guzman
SQL Server MVP
http://weblogs.sqlteam.com/dang/

From: Russell Fields on
>> IMHO, Unicode (end-to-end) is the best

Very interesting account. And I totally agree on Unicode.

RLF


From: "Sylvain Lafontaine" sylvain aei ca on
The problem here is that the OP thinks he will save a lot of time by using
ANSI with a mix of code pages and binary collations, to avoid the trouble of
converting his application to Unicode to support various languages. But the
truth is that he will probably get exactly the opposite, i.e., lose an
incredible amount of time trying to find his path through the labyrinth of
code pages and binary collations.

While you can use single-byte code pages to display nearly all languages, and
it should be relatively easy to do so if you limit yourself to a well-delimited
client access (the web site in this case), I won't be surprised to learn that
the client(s) will want to support more than one *foreign* language after a
few months and that after a few years (if it takes that long), the
application will have become a labyrinth of complexity.

It's incredible how many days and weeks I have lost in the past trying to fix
applications that were still using a single-byte code page instead of
Unicode, and in all these cases there was only a single language (French)
involved, not many. All these applications using a single-byte code page
work perfectly well when they are accessed by a single type of client
interface, but from the minute you want to start accessing them from
multiple points, they are doomed. Theoretically they are not, but
practically they are; hence the difficulty of finding the right advice when
peering through the documentation.
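
A small Python sketch of the multiple-access-point problem, using French text only. The pairing of code pages is an assumption chosen for illustration (cp1252 for a Windows client, cp850 for a console-style client), and round_trip is a hypothetical helper, not part of any library.

```python
def round_trip(text: str, writer_encoding: str, reader_encoding: str) -> str:
    """Encode text as one client would write it, then decode as another reads it."""
    return text.encode(writer_encoding).decode(reader_encoding)


text = "français"

same_client = round_trip(text, "cp1252", "cp1252")   # one type of client: fine
other_client = round_trip(text, "cp1252", "cp850")   # second client, other code page
unicode_both = round_trip(text, "utf-8", "utf-8")    # Unicode end-to-end

print(same_client)
print(other_client)
print(unicode_both)
```

With a single type of client the bytes are always read the way they were written. The accented character is damaged only once a second client with a different code page reads the same bytes.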

To the OP: if you want to display multiple languages, using single-byte
code pages is like handling Pandora's box, and it shouldn't take very long
before the box gets opened. You might want to save time and money by
keeping single-byte code pages instead of switching to Unicode, but in the
long run you will lose much more than the little that you save at this
moment.

--
Sylvain Lafontaine, ing.
MVP - Technologies Virtual-PC
E-mail: sylvain aei ca (fill the blanks, no spam please)

