Re: Regexp match with accented character problem

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Laslo Forro <getforum(at)gmail(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Regexp match with accented character problem
Date: 2010-06-08 13:53:02
Message-ID: 3963.1276005182@sss.pgh.pa.us
Lists: pgsql-novice

Laslo Forro <getforum(at)gmail(dot)com> writes:
> It seems that accented characters are not recognized as \w.

Just FYI, that's a known problem with the regex operators if you're
using UTF8 database encoding (or more generally, any multibyte encoding,
but UTF8 is usually the one people complain about). I don't believe
updating to 8.4 would have fixed it for you --- maybe the reason the
problem went away is you switched to a different encoding, such as one
of the LATINn family?
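
For anyone who wants to check whether this applies to their setup, here is a rough sketch of what one might run in psql. It is not taken from the original report: the test strings are made up for illustration, and the results will depend on server version, database encoding, and locale. The last query is only a possible workaround and is an assumption that has not been verified on affected versions.

```sql
-- Which encoding does this database use? UTF8 is the case discussed above.
SHOW server_encoding;

-- Does \w match an accented letter here? The E'' form with a doubled
-- backslash passes \w to the regex engine whatever the setting of
-- standard_conforming_strings.
SELECT 'é' ~ E'^\\w$' AS accented_matches_w;

-- Possible workaround (assumption, untested on affected versions):
-- list the accented letters explicitly instead of relying on \w.
SELECT 'árvíztűrő' ~ E'^[a-záéíóöőúüű]+$' AS matches_explicit_class;
```

If the second query returns false in a UTF8 database but true in a LATINn one, that is consistent with the behavior described here.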

There is a tentative fix in 9.0, FWIW.

regards, tom lane
