From: Tom Lane on 19 Jun 2008 12:29

I've identified the cause of bug #4253:

	/* Trim trailing space */
	while (*pbuf && !t_isspace(pbuf))
		pbuf++;
	*pbuf = '\0';

At least on Macs, t_isspace is capable of returning "true" when pointed
at the second byte of a 2-byte UTF8 character.  This explains the report
that the letter "�" has a problem when some other ones don't.  Of course
pbuf needs to be incremented using pg_mblen not just ++.

I looked around for other occurrences of the same problem and found a
couple.  I also found occurrences of the same pattern for skipping
whitespace:

	while (*s && t_isspace(s))
		s++;

This is safe if and only if t_isspace is never true for multibyte
characters ... can anyone think of a counterexample?

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
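
The failure mode described in this message can be reproduced outside the backend. What follows is only a minimal sketch under stated assumptions, not the PostgreSQL code: utf8_mblen is a hypothetical stand-in for pg_mblen that handles UTF-8 only, and bytewise_isspace is a hypothetical classifier that treats the single byte 0xA0 as a space, which models the kind of misbehaviour reported for t_isspace. The sample character "à" (0xC3 0xA0) is chosen purely for illustration because its second byte is 0xA0; it is not necessarily the character from the bug report.

```c
/*
 * Hypothetical stand-in for pg_mblen(): byte length of the UTF-8 character
 * starting at s. Returns 1 for a malformed or truncated sequence, so a
 * caller stepping by this amount never moves past the terminating NUL.
 */
static int
utf8_mblen(const char *s)
{
	unsigned char c = (unsigned char) *s;
	int			len = 1;
	int			i;

	if ((c & 0xE0) == 0xC0)
		len = 2;
	else if ((c & 0xF0) == 0xE0)
		len = 3;
	else if ((c & 0xF8) == 0xF0)
		len = 4;

	/* every following byte must be a continuation byte (10xxxxxx) */
	for (i = 1; i < len; i++)
		if (((unsigned char) s[i] & 0xC0) != 0x80)
			return 1;
	return len;
}

/*
 * Assumption: a classifier that looks at one byte only and counts 0xA0 as
 * whitespace. This is an illustration, not the real t_isspace().
 */
static int
bytewise_isspace(const char *s)
{
	unsigned char c = (unsigned char) *s;

	return c == ' ' || (c >= '\t' && c <= '\r') || c == 0xA0;
}

/* The original pattern: cut the string at the first space, stepping by byte. */
static void
truncate_bytewise(char *pbuf)
{
	while (*pbuf && !bytewise_isspace(pbuf))
		pbuf++;
	*pbuf = '\0';
}

/* Same loop, but stepping by whole characters. */
static void
truncate_mb(char *pbuf)
{
	while (*pbuf && !bytewise_isspace(pbuf))
		pbuf += utf8_mblen(pbuf);
	*pbuf = '\0';
}
```

With the input "voil\xC3\xA0 tout", truncate_bytewise stops on the 0xA0 byte and cuts the two-byte character in half, while truncate_mb only ever tests the first byte of each character and stops at the real space.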
From: Tom Lane on 19 Jun 2008 13:23

I wrote:
> This is safe if and only if t_isspace is never true for multibyte
> characters ... can anyone think of a counterexample?

Non-breaking space is a counterexample, so I pg_mblen-ified those loops
too.  Fortunately this code only executes during dictionary cache load,
so a few extra cycles aren't too critical.

			regards, tom lane
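
The non-breaking-space counterexample can be sketched the same way. The code below is an illustration under assumptions, not the committed fix: utf8_mblen is a hypothetical UTF-8-only stand-in for pg_mblen, and is_space_mb is a hypothetical character-aware classifier that recognises ASCII whitespace plus U+00A0 (encoded as 0xC2 0xA0).

```c
/* Hypothetical stand-in for pg_mblen(), UTF-8 only; 1 on malformed input. */
static int
utf8_mblen(const char *s)
{
	unsigned char c = (unsigned char) *s;
	int			len = 1;
	int			i;

	if ((c & 0xE0) == 0xC0)
		len = 2;
	else if ((c & 0xF0) == 0xE0)
		len = 3;
	else if ((c & 0xF8) == 0xF0)
		len = 4;

	/* a non-continuation byte (including NUL) means a broken sequence */
	for (i = 1; i < len; i++)
		if (((unsigned char) s[i] & 0xC0) != 0x80)
			return 1;
	return len;
}

/*
 * Assumption: a classifier that is true for ASCII whitespace and for the
 * two-byte sequence 0xC2 0xA0 (U+00A0 NO-BREAK SPACE). Callers must pass a
 * pointer to a non-NUL byte, so reading s[1] stays inside the string.
 */
static int
is_space_mb(const char *s)
{
	unsigned char c = (unsigned char) s[0];

	if (c == ' ' || (c >= '\t' && c <= '\r'))
		return 1;
	return c == 0xC2 && (unsigned char) s[1] == 0xA0;
}

/* The original skipping pattern: advances one byte per space character. */
static const char *
skip_space_bytewise(const char *s)
{
	while (*s && is_space_mb(s))
		s++;
	return s;
}

/* Skipping by whole characters. */
static const char *
skip_space_mb(const char *s)
{
	while (*s && is_space_mb(s))
		s += utf8_mblen(s);
	return s;
}
```

Given a string that starts with a non-breaking space, skip_space_bytewise returns a pointer to the stray 0xA0 continuation byte, whereas skip_space_mb returns a pointer to the first character after the space.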