From: aldnin on
> My guess is that your PHP is not setup to handle UTF8, and is really
> sending something else. UTF8 is the default client encoding because that
> is the encoding of the database. It does not mean that PHP has set the
> right one. Before running your test, try executing this: "SET
> client_encoding TO LATIN1;" and see if that fixes it.

I already did this and all the encoding settings are correct, but I've figured out a few more things.

1) Using pg_query to fetch UTF8 data from the database works properly. Of course, when I output it directly I get something like "lacarrière" - but when I run utf8_decode() on the UTF8 bytes I get it the right way: "lacarrière".
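
Here's roughly what I mean - just a sketch, the connection parameters and the table are invented for the example:

<?php
$conn = pg_connect("dbname=test user=test");

$res = pg_query($conn, "SELECT name FROM people WHERE id = 1");
$row = pg_fetch_assoc($res);

echo $row['name'];              // raw UTF8 bytes, shows up as "lacarrière"
echo utf8_decode($row['name']); // decoded to Latin-1, shows "lacarrière"
?>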

2) I found another PHP application that is able to insert UTF8 data properly, phpPgAdmin, but it seems to use the ADODB layer for executing its SQL statements.
Still, the fact that phpPgAdmin runs on the same machine and handles UTF8 data properly means that my PHP is configured correctly for UTF8.

3) When I apply utf8_encode() to the query string in my DB class before sending it to the database, it works properly - the insert is fine - so that's a temporary solution for my first problem.
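
A sketch of what the temporary fix looks like - the wrapper is simplified, my real DB class does more than this:

<?php
$conn = pg_connect("dbname=test user=test");

// Encode the whole query string to UTF8 before it goes to the server,
// so that Latin-1 literals in the SQL arrive as valid UTF8.
function db_query($conn, $sql)
{
    return pg_query($conn, utf8_encode($sql));
}

db_query($conn, "INSERT INTO people (name) VALUES ('lacarrière')");
?>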

4) When I get data from the database I would normally have to run utf8_decode() on EVERY string that is fetched. So my solution now is to treat all strings coming from the database as raw UTF8 bytes, and to decode them only at the point where they are actually needed for further use.
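
In practice the handling looks like this (again a sketch, same invented table as above):

<?php
$conn = pg_connect("dbname=test user=test");

// Keep the raw UTF8 bytes around internally...
$res = pg_query($conn, "SELECT name FROM people");
$name = pg_fetch_result($res, 0, 'name'); // raw UTF8, not decoded

// ...and decode only at the one place where Latin-1 output is needed.
echo utf8_decode($name);
?>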

Problem:
--------
Just declaring the string 'lacarrière' 10 million times takes 5 seconds; doing a utf8_encode() on it as well takes 13 seconds. So it needs 2-3 times more resources to always run utf8_encode() on a string, even when the string contains no special characters - and those resources are wasted whenever the strings don't need to be UTF8-encoded at all.
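
For reference, this is roughly how I measured it - the absolute numbers of course depend on the machine and PHP version:

<?php
$n = 10000000;

$t0 = microtime(true);
for ($i = 0; $i < $n; $i++) {
    $s = 'lacarrière';              // plain assignment: ~5 seconds here
}
$t1 = microtime(true);
for ($i = 0; $i < $n; $i++) {
    $s = utf8_encode('lacarrière'); // assignment plus encoding: ~13 seconds
}
$t2 = microtime(true);

printf("declare: %.1fs, utf8_encode: %.1fs\n", $t1 - $t0, $t2 - $t1);
?>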

Workaround:
-----------
To avoid wasting resources you have to run utf8_encode() only when you "guess" that there might be special characters - have fun with that, but it's the only way I see to work properly with special characters in combination with Postgres.
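
One way to do the "guessing" would be to check for bytes outside the ASCII range before encoding - just a sketch, the helper name is made up and I haven't tested it much:

<?php
function maybe_utf8_encode($s)
{
    // Pure ASCII strings are already valid UTF8, leave them alone;
    // only strings with high bytes (Latin-1 special characters) get encoded.
    if (preg_match('/[\x80-\xff]/', $s)) {
        return utf8_encode($s);
    }
    return $s;
}

echo maybe_utf8_encode('plain ascii'); // untouched
echo maybe_utf8_encode('lacarrière');  // gets encoded
?>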