From: Rui Zhang on
Hi group,

I was trying to figure out the internal representation of the clob
type column to see if it is UTF8 encoded, but couldn't find any
functions (like dump function provided for varchar2 type colomn) or
packages. Here is the procedure I used, but I am not sure if it gives
me the actual internal representation. I read the clob to a varchar2
variable and dump it to hex code. Thank you!

DECLARE
i1 CLOB;
len NUMBER;
my_vr varchar2(1000);
my_vr2 varchar2(1000);
BEGIN
SELECT action_log
INTO i1
FROM table1
WHERE id = '742';

len := dbms_lob.getlength(i1);
dbms_output.put_line('Column Length: ' || TO_CHAR(len));
dbms_lob.read(i1, len,1,my_vr);

select dump(my_vr, 16)
into my_vr2
from dual;

dbms_output.put_line(my_vr2);
END;
/
From: ddf on
On Mar 16, 9:30 am, Rui Zhang <jackcha...(a)gmail.com> wrote:
> Hi group,
>
> I was trying to figure out the internal representation of the clob
> type column to see if it is UTF8 encoded, but couldn't find any
> functions (like dump function provided for varchar2 type colomn) or
> packages. Here is the procedure I used, but I am not sure if it gives
> me the actual internal representation. I read the clob to a varchar2
> variable and dump it to hex code. Thank you!
>
> DECLARE
>  i1    CLOB;
>  len   NUMBER;
>  my_vr varchar2(1000);
>  my_vr2 varchar2(1000);
> BEGIN
>   SELECT action_log
>   INTO i1
>   FROM table1
>   WHERE id = '742';
>
>   len := dbms_lob.getlength(i1);
>   dbms_output.put_line('Column Length: ' || TO_CHAR(len));
>   dbms_lob.read(i1, len,1,my_vr);
>
>   select dump(my_vr, 16)
>   into my_vr2
>   from dual;
>
>   dbms_output.put_line(my_vr2);
> END;
> /

What national character set are you using for this database? CLOB and
VARCHAR2 columns (among others) use the 'normal' character set defined
at database creation and the N-named counterparts (NCLOB, NVARCHAR2)
use UTF-8 (or whichever multi-byte characterset is defined as the
NLS_NCHAR_CHARACTERSET). If NLS_CHARACTERSET is not defined as UTF8
or one of the other UTF-8 compatible variants then your CLOBs are not
likely to be using UTF-8.


David Fitzjarrell
From: Michel Cadot on

"Rui Zhang" <jackchang1(a)gmail.com> a �crit dans le message de news:
02013ebf-b54d-4782-8968-3fedbd9d6414(a)30g2000yqi.googlegroups.com...
| Hi group,
|
| I was trying to figure out the internal representation of the clob
| type column to see if it is UTF8 encoded, but couldn't find any
| functions (like dump function provided for varchar2 type colomn) or
| packages. Here is the procedure I used, but I am not sure if it gives
| me the actual internal representation. I read the clob to a varchar2
| variable and dump it to hex code. Thank you!
|
| DECLARE
| i1 CLOB;
| len NUMBER;
| my_vr varchar2(1000);
| my_vr2 varchar2(1000);
| BEGIN
| SELECT action_log
| INTO i1
| FROM table1
| WHERE id = '742';
|
| len := dbms_lob.getlength(i1);
| dbms_output.put_line('Column Length: ' || TO_CHAR(len));
| dbms_lob.read(i1, len,1,my_vr);
|
| select dump(my_vr, 16)
| into my_vr2
| from dual;
|
| dbms_output.put_line(my_vr2);
| END;
| /

Since 10g CLOB are stored in AL16UTF16 if your character set is a multi-byte one
otherwise it is in the database character set itself.

Regards
Michel


From: Rui Zhang on
On Mar 16, 9:39 am, "Michel Cadot" <micadot{at}altern{dot}org> wrote:
> "Rui Zhang" <jackcha...(a)gmail.com> a écrit dans le message de news:
> 02013ebf-b54d-4782-8968-3fedbd9d6...(a)30g2000yqi.googlegroups.com...
> | Hi group,
> |
> | I was trying to figure out the internal representation of the clob
> | type column to see if it is UTF8 encoded, but couldn't find any
> | functions (like dump function provided for varchar2 type colomn) or
> | packages. Here is the procedure I used, but I am not sure if it gives
> | me the actual internal representation. I read the clob to a varchar2
> | variable and dump it to hex code. Thank you!
> |
> | DECLARE
> | i1    CLOB;
> | len   NUMBER;
> | my_vr varchar2(1000);
> | my_vr2 varchar2(1000);
> | BEGIN
> |  SELECT action_log
> |  INTO i1
> |  FROM table1
> |  WHERE id = '742';
> |
> |  len := dbms_lob.getlength(i1);
> |  dbms_output.put_line('Column Length: ' || TO_CHAR(len));
> |  dbms_lob.read(i1, len,1,my_vr);
> |
> |  select dump(my_vr, 16)
> |  into my_vr2
> |  from dual;
> |
> |  dbms_output.put_line(my_vr2);
> | END;
> | /
>
> Since 10g CLOB are stored in AL16UTF16 if your character set is a multi-byte one
> otherwise it is in the database character set itself.
>
> Regards
> Michel

the query result is AL32UTF8
SELECT value FROM nls_database_parameters WHERE
parameter='NLS_CHARACTERSET'

Does this mean the CLOB is saved in UTF8?
when I try to work on column(convert it to a json object), I got a
malformed UTF8 error.
From: Michel Cadot on

"Rui Zhang" <jackchang1(a)gmail.com> a �crit dans le message de news:
3104676a-70a3-4a0b-b711-b998238285b2(a)x12g2000yqx.googlegroups.com...
On Mar 16, 9:39 am, "Michel Cadot" <micadot{at}altern{dot}org> wrote:
> "Rui Zhang" <jackcha...(a)gmail.com> a �crit dans le message de news:
> 02013ebf-b54d-4782-8968-3fedbd9d6...(a)30g2000yqi.googlegroups.com...
> | Hi group,
> |
> | I was trying to figure out the internal representation of the clob
> | type column to see if it is UTF8 encoded, but couldn't find any
> | functions (like dump function provided for varchar2 type colomn) or
> | packages. Here is the procedure I used, but I am not sure if it gives
> | me the actual internal representation. I read the clob to a varchar2
> | variable and dump it to hex code. Thank you!
> |
> | DECLARE
> | i1 CLOB;
> | len NUMBER;
> | my_vr varchar2(1000);
> | my_vr2 varchar2(1000);
> | BEGIN
> | SELECT action_log
> | INTO i1
> | FROM table1
> | WHERE id = '742';
> |
> | len := dbms_lob.getlength(i1);
> | dbms_output.put_line('Column Length: ' || TO_CHAR(len));
> | dbms_lob.read(i1, len,1,my_vr);
> |
> | select dump(my_vr, 16)
> | into my_vr2
> | from dual;
> |
> | dbms_output.put_line(my_vr2);
> | END;
> | /
>
> Since 10g CLOB are stored in AL16UTF16 if your character set is a multi-byte one
> otherwise it is in the database character set itself.
>
> Regards
> Michel

the query result is AL32UTF8
SELECT value FROM nls_database_parameters WHERE
parameter='NLS_CHARACTERSET'

Does this mean the CLOB is saved in UTF8?
when I try to work on column(convert it to a json object), I got a
malformed UTF8 error.

-----------------------------

As I said, in this case CLOB are stored in AL16UTF16.
CLOB was never stored in UTF8.

Regards
Michel