From: joel garry on
On Aug 23, 12:19 pm, Ben <bal...(a)comcast.net> wrote:
> On Aug 23, 2:48 pm, sybra...(a)hccnet.nl wrote:
> > On Thu, 23 Aug 2007 07:02:10 -0700, Ben <bal...(a)comcast.net> wrote:
> > >I'm not saying it is feasible to have a database set to use US7ASCII
> > >as its character set. I'm simply saying that in the scenario that
> > >Sybrand listed, 1 database and 1 client both being set to us7ascii, I
> > >don't see the issue. UNLESS you introduce a client using a different
> > >character set.
>
> > Ok, again
>
> > Client set to US7ASCII
> > Database set to US7ASCII
> > You send an eight bit character.
> > Oracle sees 7 bit client character set, 7 bit server character set
> > --->
> > HEY, I DON'T HAVE TO CONVERT ANY CHARACTER.
> > What will happen if all of a sudden someone decides to export using 7
> > bit NLS_LANG and import into 8 bit database.
>
> > Please don't imply I'm making up fairy tales, I'm talking stories for
> > grown ups!!!!
> > REAL WORLD HORROR STORIES with customers getting GROSS!!!
>
> > And yes: this explanation is on Metalink!!!
>
> > --
>
> I'm not implying anything. I'm trying to understand.
>
> How do you insert an 8 bit character with a 7 bit client into a db
> with a 7 bit character set? Wouldn't that be a square peg round hole
> kind of thing? You of course wouldn't get the 8 bit character back out
> of the 7 bit db.

This depends entirely on what tool you are using to insert the
data. Sybrand is correct: when the client and server character sets
are declared equal, Oracle performs no conversion and no validation
of the data at all. The tool may understand the 8 bits without
consulting Oracle's settings, and it may be able to get the 8 bits
back out. exp will export the 8 bits untouched if, again, you have
the client and server both set to 7 bits. Then, when you import into
an 8-bit server with the client still set to 7 bits, you get the
extremely helpful conversion boning or deboning your data, as the
case may be.
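The no-conversion-then-mangle sequence can be sketched in a few lines of Python. This is only a rough simulation of the conversion rule, not Oracle's actual behaviour, and the Python codec names "ascii" and "latin-1" stand in for US7ASCII and WE8ISO8859P1:

```python
# Rough simulation of the client/server conversion rule described above:
# if the two character sets are declared equal, bytes pass through
# unchecked; otherwise the data is decoded and re-encoded, with
# unmappable characters replaced.
def transfer(data: bytes, client_cs: str, server_cs: str) -> bytes:
    if client_cs == server_cs:
        return data  # no conversion, no validation
    text = data.decode(client_cs, errors="replace")
    return text.encode(server_cs, errors="replace")

eight_bit = "café".encode("latin-1")  # contains the byte 0xE9, > 127

# Both sides claim US7ASCII: the 0xE9 byte is stored exactly as sent.
stored = transfer(eight_bit, "ascii", "ascii")
assert stored == eight_bit

# exp with a 7-bit NLS_LANG, imp into an 8-bit database: now a
# conversion runs, and the illegal byte becomes a replacement mark.
imported = transfer(stored, "ascii", "latin-1")
assert imported == b"caf?"
```

The same bytes that sailed through unchecked on the way in are the ones the later import "helpfully" converts into question marks.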

>
> > >You don't really have control over what character set all the clients
> > >connect with, do you? If you have a client that uses US7ASCII and they
> > >select then update based on results, you could potentially corrupt all
> > >your data. no?
>
> The example in Mr Kyte's book is what I am referring to in my original
> question of not being able to avoid corruption. How can you keep
> someone from setting their NLS_LANG to us7ascii and updating an 8 bit
> or multibyte character field? Anytime that happens you would get a
> replacement character wouldn't you

No, you might get a correct conversion; it depends on whether the
7-bit character set is a subset of the 8-bit set, and on which 8-bit
character set it is.

Keeping people from changing their client settings is close to
impossible. What you must do is set things up properly so they don't
see something wrong and start changing things.
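The subset point can be checked directly. A minimal Python sketch, with the codec names again standing in for the Oracle character sets:

```python
# Whether an update through a 7-bit client corrupts anything depends
# on whether the characters involved fall inside the 7-bit subset.
def to_server(text: str, server_cs: str) -> bytes:
    # Simulated client-to-server conversion, replacing on failure.
    return text.encode(server_cs, errors="replace")

assert to_server("plain ascii", "ascii") == b"plain ascii"  # in the subset: unchanged
assert to_server("caf\u00e9", "ascii") == b"caf?"           # outside it: replaced
```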

jg
--
@home.com is bogus.
http://www.shirky.com/writings/group_enemy.html


From: Laurenz Albe on
sybrandb(a)hccnet.nl wrote:
>>2) If you (as misguided admins too often do) mistakenly set NLS_LANG
>> to the database character set, no character conversion AND NO
>> INTEGRITY CHECKS are performed and you can store all sort of garbage
>> in your database without even noticing. It will cause problems later on,
>> though. This is an Oracle bug in my opinion, although Oracle will
>> probably disagree with me on this.
>
> It's not a bug. Setting the database characterset to the character set
> of the client is, especially when the server O/S doesn't support it.
> Setting the client characterset to the characterset of the database,
> especially when the O/S doesn't support the database characterset is a
> bug too.
> Eventually it is a Mickeysoft bug as Mickeysoft is not supporting the
> correct ISO characterset.
> It is NOT an Oracle bug!

You got me wrong.

Of course it is not Oracle's bug if I set my NLS_LANG wrong.

But it is Oracle's bug (in my opinion) if I have set the client
character set to US7ASCII, insert a byte > 127 into a text field, and
neither get an error nor (as Oracle seems to prefer) have the byte
clandestinely converted to a question mark.

I claim that the missing check for incorrect characters is a bug.
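The missing check Laurenz describes would amount to validating incoming bytes against the declared client character set. A hypothetical strict check in Python — by his own account, nothing Oracle actually does:

```python
# Hypothetical strict validation: reject bytes that are illegal in the
# client's declared character set instead of storing them silently.
def validate(data: bytes, client_cs: str) -> bytes:
    data.decode(client_cs)  # raises UnicodeDecodeError on an illegal byte
    return data

validate(b"hello", "ascii")        # fine: every byte is <= 127
try:
    validate(b"caf\xe9", "ascii")  # 0xE9 is not a US7ASCII character
except UnicodeDecodeError:
    print("rejected instead of silently stored")
```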

Yours,
Laurenz Albe

PS: By "bug" I mean a software error, not a user mistake.
From: Martin T. on
Laurenz Albe wrote:
> sybrandb(a)hccnet.nl wrote:
>>> (...)
>
> You got me wrong.
>
> Of course it is not Oracle's bug if I set my NLS_LANG wrong.
>
> But it is Oracle's bug (in my opinion) if I have set the client
> character set to US7ASCII, insert a byte > 127 in a text field, and
> neither get an error nor (as Oracle seems to prefer) have the byte
> clandestinely converted to a question mark.
>
> I claim that the missing check for incorrect characters is a bug.
>

Amen. But still Oracle will probably tell you that it's a Feature, not a
Bug.
From: Ben on
On Aug 23, 6:22 pm, joel garry <joel-ga...(a)home.com> wrote:
> On Aug 23, 12:19 pm, Ben <bal...(a)comcast.net> wrote:
> > On Aug 23, 2:48 pm, sybra...(a)hccnet.nl wrote:
>
> > > On Thu, 23 Aug 2007 07:02:10 -0700, Ben <bal...(a)comcast.net> wrote:
> > > >I'm not saying it is feasible to have a database set to use US7ASCII
> > > >as its character set. I'm simply saying that in the scenario that
> > > >Sybrand listed, 1 database and 1 client both being set to us7ascii, I
> > > >don't see the issue. UNLESS you introduce a client using a different
> > > >character set.
>
> > > Ok, again
>
> > > Client set to US7ASCII
> > > Database set to US7ASCII
> > > You send an eight bit character.
> > > Oracle sees 7 bit client character set, 7 bit server character set
> > > --->
> > > HEY, I DON'T HAVE TO CONVERT ANY CHARACTER.
> > > What will happen if all of a sudden someone decides to export using 7
> > > bit NLS_LANG and import into 8 bit database.
>
> > > Please don't imply I'm making up fairy tales, I'm talking stories for
> > > grown ups!!!!
> > > REAL WORLD HORROR STORIES with customers getting GROSS!!!
>
> > > And yes: this explanation is on Metalink!!!
>
> > > --
>
> > I'm not implying anything. I'm trying to understand.
>
> > How do you insert an 8 bit character with a 7 bit client into a db
> > with a 7 bit character set? Wouldn't that be a square peg round hole
> > kind of thing? You of course wouldn't get the 8 bit character back out
> > of the 7 bit db.
>
> This is entirely dependent on what tool you are using to insert the
> data. Sybrand is correct, Oracle simply doesn't check if the sets are
> equal between client and server. The tool may understand the 8 bits
> without checking Oracle settings. The tool may be able to get the 8
> bits back out. exp will export the 8 bits if, again, you have the
> client and server set to 7 bits. Then when you import into an 8 bit
> server setting with the client set to 7 bits, you get the extremely
> helpful conversion boning or deboning your data, as the case may be.
>
> > > >You don't really have control over what character set all the clients
> > > >connect with, do you? If you have a client that uses US7ASCII and they
> > > >select then update based on results, you could potentially corrupt all
> > > >your data. no?
>
> > The example in Mr Kyte's book is what I am referring to in my original
> > question of not being able to avoid corruption. How can you keep
> > someone from setting their NLS_LANG to us7ascii and updating an 8 bit
> > or multibyte character field? Anytime that happens you would get a
> > replacement character wouldn't you
>
> No, you might get a conversion, depends on whether the 7 bit character
> is a subset of the 8 bit set, and what the 8 bit character set is.
>
> Keeping people from setting their clients is close to impossible, what
> you must do is set things up properly so they don't see something
> wrong and start changing things.
>
> jg
> --
> @home.com is bogus.
> http://www.shirky.com/writings/group_enemy.html



Thank you for the explanation.

From: joel garry on
On Aug 28, 10:54 am, Ben <bal...(a)comcast.net> wrote:

>
> After reading this, it sounds like Oracle is saying that conversion
> is a bad thing and that the database character set should be set the
> same as the client's. This is the reason there is so much confusion,
> at least on my part, about character sets. I know this can't be taken
> in isolation from the need for multi-byte characters and support for
> different client character sets, but the wording in the documentation
> is bad, or at the least confusing.

You might also try searching the Metalink knowledge base for the term
"characterset". You may wind up even more confused, but some of the
documents there are enlightening.

jg
--
@home.com is bogus.
What kind of system is this, anyways? http://www.pcmag.com/article2/0,1895,2176192,00.asp