On 14 Feb 2012, at 18:28, Tom Lane wrote:
> Oh, I see the reason for this: the code in cclass() in regc_locale.c
> doesn't go further up than U+00FF, so no codes above that will be
> thought to be letters (or members of any other character class).
> Clearly we need to go further when we are dealing with UTF8.
> I'm not sure what a sane limit would be though.
The Basic Multilingual Plane goes up to U+FFFF:
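
To make the effect of the cutoff concrete, here is a rough sketch in Python. It is not the code from regc_locale.c: the names LATIN1_LIMIT, BMP_LIMIT, letters_up_to and is_letter are made up for illustration, and Python's str.isalpha() (which consults the Unicode database) stands in for whatever locale-dependent test the server actually uses. It only shows that a scan stopping at U+00FF never sees letters that a scan up to U+FFFF would find.

```python
# Hypothetical illustration only; this is NOT the PostgreSQL implementation.
# Assumption: str.isalpha() stands in for the server's locale-based letter test.

LATIN1_LIMIT = 0x00FF  # highest code point scanned, per the quoted message
BMP_LIMIT = 0xFFFF     # top of the Basic Multilingual Plane


def letters_up_to(limit):
    """Return the set of code points <= limit that Python treats as alphabetic."""
    return {cp for cp in range(limit + 1) if chr(cp).isalpha()}


def is_letter(ch, limit):
    """Treat ch as a letter only if its code point lies within the scanned range."""
    return ord(ch) <= limit and ch.isalpha()
```

With these definitions, is_letter("é", LATIN1_LIMIT) is true because U+00E9 is inside the scanned range, while is_letter("ž", LATIN1_LIMIT) is false because U+017E lies above it. Raising the limit to BMP_LIMIT makes the second call true as well.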