From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Regexps vs. locale |
Date: | 2009-01-07 04:44:24 |
Message-ID: | 200901070444.n074iOM19932@momjian.us |
Lists: | pgsql-hackers |
Added to TODO:

Add ability to use case-insensitive regular expressions on multi-byte
characters

* ILIKE already works with multi-byte characters
* http://archives.postgresql.org/pgsql-hackers/2008-12/msg00433.php

---------------------------------------------------------------------------
Andrew Gierth wrote:
> This came up on irc:
>
> postgres=# show lc_ctype;
> lc_ctype
> -------------
> fr_FR.UTF-8
>
> postgres=# show server_encoding;
> server_encoding
> -----------------
> UTF8
> (1 row)
>
> postgres=# select E'\303\201' ILIKE E'\303\241';
> ?column?
> ----------
> t
> (1 row)
>
> postgres=# select E'\303\201' ~* E'\303\241';
> ?column?
> ----------
> f
> (1 row)
>
> Obviously, this happens because the locale support functions in
> backend/regex/regc_locale.c are (presumably intentionally) crippled so
> as not to support non-ascii chars, despite all the code there using
> wide chars for everything otherwise.
>
> Why is this? It does not appear to be a documented restriction.
>
> --
> Andrew (irc:RhodiumToad)
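
For reference, the two literals in the quoted session are the UTF-8 byte sequences for U+00C1 (capital A with acute) and U+00E1 (small a with acute), so a case-insensitive match between them should succeed. The Python sketch below is only an illustration of the difference Andrew describes: it contrasts a Unicode-aware case fold with one that handles nothing beyond ASCII. The helper `ascii_only_lower` is hypothetical and is not taken from regc_locale.c; it merely assumes that an ASCII-limited case mapping is what makes `~*` return false here.

```python
# The two string literals from the psql session, written as raw UTF-8 bytes
# using the same octal escapes as the E'...' strings.
upper = b"\303\201".decode("utf-8")  # U+00C1, capital A with acute
lower = b"\303\241".decode("utf-8")  # U+00E1, small a with acute


def ascii_only_lower(s):
    """Hypothetical case fold that only maps A-Z to a-z.

    This is an assumption for illustration, not PostgreSQL's code:
    every character outside the ASCII uppercase range is left unchanged.
    """
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


# Unicode-aware folding treats the two characters as a case pair.
print(upper.lower() == lower)            # True

# An ASCII-only fold leaves U+00C1 untouched, so the comparison fails.
print(ascii_only_lower(upper) == lower)  # False
```

The first comparison corresponds to the ILIKE result (`t`) and the second to the `~*` result (`f`) in the session above.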
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +