Re: ILIKE vs indices

From: James Cloos <cloos(at)jhcloos(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: ILIKE vs indices
Date: 2012-12-29 10:23:31
Message-ID: m3vcblcg1v.fsf@carbon.jhcloos.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>>>>> "TL" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

JC>> Is there any contraindication to recasting:
JC>> foo ILIKE 'bar'
JC>> into:
JC>> LOWER(foo) LIKE LOWER('bar')

TL> In some locales those are not equivalent, I believe, or at least
TL> shouldn't be. (What the current code actually does is a separate
TL> question.)

I see. After determining indexing based on th existance of an initial
fixed string, exluding anything matching isalpha(), it uses tolower(3)
and friends to do the actual match.

So my proposal wouldn't change what matches, but might make fixing any
bugs in what *should* match more difficult?

TL> In any case it's not obvious why LOWER rather than UPPER.

I suggested lower() because that matched all of the suggestions I found.
And as it turns out matches the current behaviour, too.

The footnote about adding explicit mention to the docs was expressly
because it is not otherwise obvious whether indices should use lower()
or upper().

I'll ask on one of the unicode lists whether there are any locales where
a case-insensitive match should be different than a case-preserving match
of tolower() vs tolower().

-JimC
--
James Cloos <cloos(at)jhcloos(dot)com> OpenPGP: 1024D/ED7DAEA6

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Farina 2012-12-29 12:21:56 Re: pg_stat_statements: calls under-estimation propagation
Previous Message Pavel Stehule 2012-12-29 08:14:23 Re: proposal: ANSI SQL 2011 syntax for named parameters