Re: Backpatching of "Teach the regular expression functions to do case-insensitive matching"

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Backpatching of "Teach the regular expression functions to do case-insensitive matching"
Date: 2011-05-07 02:35:35
Message-ID: BANLkTinX+SKFkrFfmDAF_MjgGf88YSoeCQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 6, 2011 at 9:22 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On Friday, May 06, 2011 04:30:01 AM Robert Haas wrote:
>> On Thu, May 5, 2011 at 5:21 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> > In my opinion this is actually a bug in < 9.0. As it's an (imo) low-impact
>> > fix that's constrained to two files, it seems sensible to backpatch it now
>> > that the solution has proven itself in the field?
>> > The issue is hard to find and has come up several times in the field. And
>> > it has been slightly embarrassing more than once ;)
>> Can you share some more details about your experiences?
> About the embarrassing or hard-to-find part?
>
> One of the hard-to-find parts involved a search (constraining word order
> after a tsearch search) where slightly fewer search results than usual were
> returned in production.
> Nobody had noticed during testing that case-insensitive search worked for most
> things except multibyte chars, as the tested case was something like: SELECT
> 'ÖFFENTLICHKEIT' ~* 'Öffentlichkeit' and the regex condition was only relevant
> when searching for multiple words.
>
> One of the embarrassing examples was that I suggested moving away from a
> solution using several ILIKE rules to one case-insensitive regular expression,
> totally forgetting that I knew that this was only fixed in 9.0. This turned out
> to be faster. And it turned out to be wrong. In production :-(.
>
>
> Both sum up that the problem is often not noticed, as most of the people
> realizing that case could be a problem don't have knowledge of the
> content and don't notice the problem until later...
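
The testing blind spot described above can be illustrated with a small Python sketch. It is a simplified model, not PostgreSQL's regex code: it assumes the pre-9.0 behavior amounts to folding case only for ASCII letters, and the helper names (ascii_only_fold, matches_ascii_fold, matches_unicode_fold) are hypothetical. The pattern is treated as a literal string to keep the example short.

```python
import re

# Explicit escapes avoid any ambiguity about Unicode normalization in this file.
UPPER_O_UMLAUT = "\u00d6"  # LATIN CAPITAL LETTER O WITH DIAERESIS
LOWER_O_UMLAUT = "\u00f6"  # LATIN SMALL LETTER O WITH DIAERESIS


def ascii_only_fold(text: str) -> str:
    """Lower-case only A-Z; every non-ASCII character is left untouched.

    Assumption: this stands in for case folding that does not understand
    multibyte characters.
    """
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in text)


def matches_ascii_fold(subject: str, literal_pattern: str) -> bool:
    """Case-insensitive search where only ASCII letters are folded."""
    folded_pattern = re.escape(ascii_only_fold(literal_pattern))
    return re.search(folded_pattern, ascii_only_fold(subject)) is not None


def matches_unicode_fold(subject: str, literal_pattern: str) -> bool:
    """Case-insensitive search using Python's Unicode-aware IGNORECASE."""
    return re.search(re.escape(literal_pattern), subject, re.IGNORECASE) is not None


pattern = UPPER_O_UMLAUT + "ffentlichkeit"
tested_subject = UPPER_O_UMLAUT + "FFENTLICHKEIT"   # the case that was tested
missed_subject = LOWER_O_UMLAUT + "ffentlichkeit"   # differs only in the umlaut's case

for subject in (tested_subject, missed_subject):
    print(subject,
          "ascii-only:", matches_ascii_fold(subject, pattern),
          "unicode-aware:", matches_unicode_fold(subject, pattern))
```

In this model the tested pair matches under both strategies because the non-ASCII letter already has the same case on both sides; only a subject whose umlaut differs in case exposes the difference, which is consistent with the problem going unnoticed during testing.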

After mulling this over a bit more, I guess I'm a little skeptical of
back-patching this because it is clearly a behavior change. It seems
unlikely, but not impossible, that someone is relying on the current
behavior, and changing it in a minor release might be considered
unfriendly.

On the flip side, the risk of it flat-out blowing up seems pretty
small. For someone to invent their own version of wchar_t that uses
something other than Unicode code points would be pretty much pure
masochism, wouldn't it?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
