From: AlterEgo on
Hi all,

This was already posted in microsoft.public.sqlserver.programming, but no
answers since yesterday.

I have the task of evaluating the effort to convert a couple of web pages of
a large application to accept multiple languages. The web pages are data
driven. The application is old and we won't change the database structure
(char to nchar, varchar to nvarchar, etc.). I know the other issues with
localization (date formats, collation, etc.), but we are specifically
excluding them. This effort is going to be for only one of many clients that
use the same web site.

My question is: Which languages can SQL Server 2005 support with a
single-byte character set?

Everything I've seen in BOL, MSDN and in general searches seems to blur the
line between what is supported with a one-byte character set vs. Unicode.

TIA,
Bill


From: Russell Fields on
Bill,

Yes, it is hard to understand. But any column you choose will need to have
a collation and that collation will set the code page. The code page will
either support or not support the language. (I suppose that is all pretty
obvious.) A column can only have one collation, so that will also limit the
languages that you can support in that column.

So, for each column you must select a collation, which will limit the
languages that column will support. Another column can have another
collation, with a different set of languages. (But I don't know how
complicated you want to get.)
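
A minimal sketch of how to check which code page a collation implies, using
the built-in COLLATIONPROPERTY and fn_helpcollations functions. The three
collation names are only examples picked for illustration; they should
report code pages 1252, 1253 and 1255 respectively.

```sql
-- Code page used for char/varchar data under a few example collations
SELECT 'Latin1_General_BIN2' AS collation_name,
       COLLATIONPROPERTY('Latin1_General_BIN2', 'CodePage') AS code_page
UNION ALL
SELECT 'Greek_BIN2', COLLATIONPROPERTY('Greek_BIN2', 'CodePage')
UNION ALL
SELECT 'Hebrew_BIN2', COLLATIONPROPERTY('Hebrew_BIN2', 'CodePage');

-- Every collation available on the server, with its code page
SELECT name,
       COLLATIONPROPERTY(name, 'CodePage') AS code_page,
       description
FROM fn_helpcollations()
ORDER BY name;
```

Any language whose characters all exist in the reported code page can be
stored in a char/varchar column that uses that collation.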

The best list mapping language to collation to code page is in the SQL Server
Compact Edition documentation:
http://technet.microsoft.com/en-us/library/ms174596.aspx

You might also appreciate some of the warnings from other documents:
http://support.microsoft.com/kb/142867 -- Access, but the principles apply.
http://msdn.microsoft.com/en-us/library/aa214408(SQL.80).aspx
http://technet.microsoft.com/en-us/library/ms186356(SQL.100).aspx

Hope it helps,
RLF


From: AlterEgo on
Russell,

Thank you for the very good and complete response. The links you provided
were most helpful. I have limited the scope of multi-language support to
storage, retrieval and display only. I have explicitly excluded full
sorting and searching capabilities. Given these caveats, does collation
matter?

I will be depending on the browser locale setting to determine which code
page to display.

Bill


From: Russell Fields on
Bill,

If you never use SQL Server to sort or search the character strings, then
the collation will not matter much to you. You might, for example, choose to
do an alphabetic sorting on the web pages.

I have to admit that I am so used to relying on the SQL Server features that
I find it hard to believe that it will have no impact at all. (E.g. creating a
unique index on character strings is affected by the collation.) So, if you
want code page 1252, I would suggest using the Latin1_General_BIN2 (or the
older Latin1_General_BIN) collation to define your use of the code page. One
of those should result in pretty much 'hands off' behavior for your data.

If you want to use code pages 850 or 437 there are
SQL_Latin1_General_CP(number)_BIN* collations.
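
To make the unique index point concrete, here is a small sketch. The table
and index names are hypothetical; both columns use code page 1252 and differ
only in how the collation compares strings.

```sql
-- Hypothetical tables: same code page (1252), different comparison rules
CREATE TABLE dbo.UniqueDemo_CI
(
    val varchar(10) COLLATE Latin1_General_CI_AS NOT NULL
);
CREATE TABLE dbo.UniqueDemo_BIN2
(
    val varchar(10) COLLATE Latin1_General_BIN2 NOT NULL
);
GO
CREATE UNIQUE INDEX UX_UniqueDemo_CI ON dbo.UniqueDemo_CI (val);
CREATE UNIQUE INDEX UX_UniqueDemo_BIN2 ON dbo.UniqueDemo_BIN2 (val);
GO

-- Binary collation: 'abc' and 'ABC' are different byte sequences,
-- so both rows are accepted
INSERT INTO dbo.UniqueDemo_BIN2 (val) VALUES ('abc');
INSERT INTO dbo.UniqueDemo_BIN2 (val) VALUES ('ABC');

-- Case-insensitive collation: 'ABC' compares equal to 'abc',
-- so the second insert is rejected as a duplicate key
INSERT INTO dbo.UniqueDemo_CI (val) VALUES ('abc');
INSERT INTO dbo.UniqueDemo_CI (val) VALUES ('ABC');
GO
```

With a binary collation the index only treats byte-identical strings as
duplicates, which is the closest thing to 'hands off' that a character
column offers.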

FWIW,
RLF


From: Dan Guzman on
Hi Russell.

> If you never use SQL Server to sort or search the character strings, then
> the collation will not matter much to you. You might, for example, choose
> to do an alphabetic sorting on the web pages.

Even with moving the sorting/comparison to the client application, I think
Bill might still have issues with round-trip character conversions with a
single database collation and different code pages on the client. A
database collation code page like 437 will mitigate, but not prevent, the
mathematical problem of fitting more than 128 different non-ASCII characters
into a single-byte code page.

A kludge to completely address the code-page problem is to store raw
character data as binary data in the database and map to the desired
code-page on the client side without code page conversion.
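
A rough sketch of what that binary-storage kludge could look like. The table
name and the ClientCodePage column are assumptions added for illustration;
the idea is only that SQL Server stores the bytes and never interprets them.

```sql
-- Hypothetical table: raw bytes plus a note of the code page the client used
CREATE TABLE dbo.RawTextExample
(
    TextId         int            NOT NULL PRIMARY KEY,
    ClientCodePage int            NOT NULL,
    RawText        varbinary(100) NOT NULL
);
GO

-- 0xC3 is Greek capital Gamma in code page 1253
INSERT INTO dbo.RawTextExample (TextId, ClientCodePage, RawText)
VALUES (1, 1253, 0xC3);

-- 0xF9 is Hebrew Shin in code page 1255
INSERT INTO dbo.RawTextExample (TextId, ClientCodePage, RawText)
VALUES (2, 1255, 0xF9);

-- The client reads the bytes back unchanged and decodes them itself
SELECT TextId, ClientCodePage, RawText
FROM dbo.RawTextExample;
GO
```

Because the column is varbinary, no collation or code page conversion is
applied on the way in or out; the price is that the server can no longer
sort or search the text in any meaningful way.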


-- Demonstrates the conversion problem: each char(1) column can only hold
-- characters that exist in the code page of its collation.
CREATE TABLE dbo.CollationExample
(
    Greek_BIN char(1) COLLATE Greek_BIN2,
    Hebrew_BIN char(1) COLLATE Hebrew_BIN2,
    Latin1_General_BIN char(1) COLLATE Latin1_General_BIN2,
    SQL_Latin1_General_CP437_BIN char(1) COLLATE SQL_Latin1_General_CP437_BIN,
    SQL_Latin1_General_CP850_BIN char(1) COLLATE SQL_Latin1_General_CP850_BIN
)
GO

-- The Unicode values are implicitly converted to each column's code page
INSERT INTO dbo.CollationExample
--Greek letter Gamma
SELECT NCHAR(915),NCHAR(915),NCHAR(915),NCHAR(915),NCHAR(915)
--Hebrew letter Shin
UNION ALL SELECT NCHAR(1513),NCHAR(1513),NCHAR(1513),NCHAR(1513),NCHAR(1513)

--display originally inserted values
SELECT NCHAR(915),NCHAR(915),NCHAR(915),NCHAR(915),NCHAR(915)
UNION ALL SELECT NCHAR(1513),NCHAR(1513),NCHAR(1513),NCHAR(1513),NCHAR(1513)

--display values as actually stored after conversion
SELECT * FROM dbo.CollationExample
GO
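
As a possible follow-up to the script above, casting each column to
varbinary shows the byte that was actually stored, independent of how the
query tool renders it. The column aliases are made up for this sketch. A
value of 0x3F (an ASCII question mark) generally means the character had no
equivalent in that column's code page.

```sql
-- Show the stored byte in each column of dbo.CollationExample
SELECT CAST(Greek_BIN AS varbinary(1)) AS Greek_byte,
       CAST(Hebrew_BIN AS varbinary(1)) AS Hebrew_byte,
       CAST(Latin1_General_BIN AS varbinary(1)) AS Latin1_byte,
       CAST(SQL_Latin1_General_CP437_BIN AS varbinary(1)) AS CP437_byte,
       CAST(SQL_Latin1_General_CP850_BIN AS varbinary(1)) AS CP850_byte
FROM dbo.CollationExample;
```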

--
Hope this helps.

Dan Guzman
SQL Server MVP
http://weblogs.sqlteam.com/dang/
