Re: Unicode support

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "- -" <crossroads0000(at)googlemail(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Unicode support
Date: 2009-04-13 19:37:49
Message-ID: 49E34E3D.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:
>> 1) Functions like char_length() or length() do NOT return the number
>> of characters (the manual says they do), instead they return the
>> number of code points.
>
> I think you have client_encoding misconfigured.
>
> alvherre=# select length('á'::text);
> length
> --------
> 1
> (1 fila)

The OP didn't say it returned the number of bytes. Since you found
that this character was stored in only two bytes, it must have been
one two-byte code point. I think storing it as two code points would
have taken at least three bytes (one for the letter and two for the
accent), no?
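
The byte arithmetic above can be checked with a minimal Python sketch. It
assumes UTF-8 encoding (the thread does not state the server encoding), and
the helper name describe() is hypothetical, chosen only for illustration. It
compares the precomposed form of the character with its decomposed form
(base letter plus combining acute accent):

```python
import unicodedata

# Precomposed form: a single code point, U+00E1 (a with acute).
precomposed = "\u00e1"

# Decomposed form (NFD): 'a' (U+0061) followed by the combining
# acute accent (U+0301), i.e. two code points.
decomposed = unicodedata.normalize("NFD", precomposed)


def describe(s):
    """Return (number of code points, number of UTF-8 bytes) for s."""
    return len(s), len(s.encode("utf-8"))


print(describe(precomposed))  # (1, 2)
print(describe(decomposed))   # (2, 3)
```

Both strings render as the same character, but only the precomposed one
fits in two bytes, which is consistent with length() reporting 1 here.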

-Kevin
