Re: Unicode support

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: - - <crossroads0000(at)googlemail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unicode support
Date: 2009-04-13 19:39:58
Message-ID: 49E3950E.8020800@dunslane.net
Lists: pgsql-hackers

Alvaro Herrera wrote:
> - - wrote:
>
>
>> 1) Functions like char_length() or length() do NOT return the number
>> of characters (the manual says they do), instead they return the
>> number of code points.
>>
>
> I think you have client_encoding misconfigured.
>
> alvherre=# select length('á'::text);
>  length
> --------
>       1
> (1 fila)
>
>
>

Umm, but isn't that because your input encodes 'á' as a single code point?

See the OP's explanation w.r.t. canonical equivalence.

This isn't about the number of bytes, but about whether we should count
a character encoded as two or more combined code points (a base
character plus combining marks) as a single character.
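
To make the distinction concrete, here is a minimal sketch in Python (not
PostgreSQL code) of the two canonically equivalent spellings of 'á'. The
helper names count_code_points and count_base_characters are hypothetical,
and skipping combining marks is only a rough approximation of counting
user-perceived characters, not full grapheme cluster segmentation.

```python
import unicodedata

# Two canonically equivalent spellings of "á".
precomposed = "\u00e1"   # U+00E1 LATIN SMALL LETTER A WITH ACUTE
decomposed = "a\u0301"   # U+0061 followed by U+0301 COMBINING ACUTE ACCENT


def count_code_points(s: str) -> int:
    """Count code points, which is what len() does on a Python str."""
    return len(s)


def count_base_characters(s: str) -> int:
    """Hypothetical helper: count code points that are not combining marks.

    Assumption: a code point with a non-zero canonical combining class
    attaches to the preceding base character. This ignores other grapheme
    cluster rules (for example Hangul jamo or emoji sequences).
    """
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)


if __name__ == "__main__":
    for label, s in (("precomposed", precomposed), ("decomposed", decomposed)):
        print(label, count_code_points(s), count_base_characters(s))
    # NFC normalization composes the decomposed form into the single code point.
    print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

The precomposed string has one code point and the decomposed one has two,
while the base-character count is one for both. Normalizing to NFC before
counting would also give one here, but not every combining sequence has a
precomposed form.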

cheers

andrew
