From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Alvaro Herrera <alvherre(at)commandprompt(dot)com> |
Cc: | - - <crossroads0000(at)googlemail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Unicode support |
Date: | 2009-04-13 19:39:58 |
Message-ID: | 49E3950E.8020800@dunslane.net |
Lists: | pgsql-hackers |
Alvaro Herrera wrote:
> - - wrote:
>
>
>> 1) Functions like char_length() or length() do NOT return the number
>> of characters (the manual says they do), instead they return the
>> number of code points.
>>
>
> I think you have client_encoding misconfigured.
>
> alvherre=# select length('á'::text);
>  length
> --------
>       1
> (1 row)
>
>
>
Umm, but isn't that because your input encodes 'á' as a single precomposed
code point? See the OP's explanation w.r.t. canonical equivalence.
This isn't about the number of bytes. It's about whether or not we should
count a character encoded as a sequence of two or more code points (a base
character plus combining marks) as a single character.
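
To make the distinction concrete, here is a minimal sketch in Python (not
PostgreSQL, purely for illustration). It compares the precomposed and
decomposed spellings of 'á'. The helper count_non_combining is a hypothetical
name of mine, and it is only a rough approximation of "user-perceived
characters", not full grapheme cluster segmentation.

```python
import unicodedata

# Two canonically equivalent spellings of the same character.
precomposed = "\u00e1"   # U+00E1 LATIN SMALL LETTER A WITH ACUTE
decomposed = "a\u0301"   # U+0061 followed by U+0301 COMBINING ACUTE ACCENT


def count_non_combining(s: str) -> int:
    """Count code points whose canonical combining class is 0.

    Hypothetical helper: a rough stand-in for "characters as the user
    sees them". It ignores many cases real grapheme segmentation handles.
    """
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)


# len() counts code points, so the two spellings differ.
print(len(precomposed))   # 1
print(len(decomposed))    # 2

# Normalizing to NFC composes the pair into the single code point.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True

# Skipping combining marks gives the same count for both spellings.
print(count_non_combining(precomposed))  # 1
print(count_non_combining(decomposed))   # 1
```

Alvaro's example presumably corresponds to the first case, while the OP's
complaint is about the second.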
cheers
andrew