Re: Indexing of LIKE queries is broken in current sources

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Indexing of LIKE queries is broken in current sources
Date: 2000-09-14 14:46:16
Message-ID: 13906.968942776@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu> writes:
>> It seems that the parser now emits some kind of function call for LIKE
>> expressions, whereas the optimizer's code to use indexes for LIKE is
>> looking for an operator.

> Oh, that's bad. I changed it to the function call to allow
> three-parameter LIKE clauses, which is a perfectly reasonable thing to
> do imho.

Well, it's a clean solution in and of itself, but it fails to take into
account what the rest of the system is able or not able to do.

> This LIKE shortcircuit stuff is a hack anyway, but where should I look
> in the optimizer?

The tip of this iceberg is the "special index operator" hacks in
backend/optimizer/path/indxpath.c. However, the true dimensions of
the problem don't become apparent until you realize that *none* of
the optimizer does anything much with function calls as opposed to
operators. To get back up to the level of functionality we had for
LIKE before, you'd also need to devise a way of doing selectivity
estimation for function calls --- ie, define an API for function
estimators, add appropriate column(s) to pg_proc, then start actually
writing some code. This would be a good thing to do someday, but
I think we have considerably more pressing problems to deal with
for 7.1.

BTW, none of the code that actually bashes LIKE/regexp patterns in
indxpath.c/selfuncs.c knows anything about nonstandard ESCAPE characters
for patterns. It was this (and the code's lack of knowledge about
ILIKE) that I'd originally had as my to-do item --- it wasn't till I
realized you'd switched from operators to functions that I saw this was
more than a trivial problem to fix.

After a little bit of thought I think I see a way out that avoids
opening the function-selectivity can of worms. Let's translate LIKE
to an operator same as we always did (and add an operator for ILIKE).
The forms with an ESCAPE clause can be translated to the same operators
but with a righthand argument that is a call of a new function
escape_for_like(pattern, escape)
escape_for_like would interpret the pattern with the given escape
character and translate it into a pattern that uses the standard escape
character, ie backslash. After constant folding, this looks just like
the old style of LIKE call and none of the optimizer code has to change
at all, except to add a case for ILIKE which will be a trivial addition.
LIKE itself gets simpler too, since the pattern matcher needn't cope
with nonstandard escape characters. Seems like a good subdivision of
labor within LIKE.

Thoughts? I'm willing to take responsibility for making this happen
if you agree it's a good solution.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bryan Field-Elliot 2000-09-14 15:13:16 Show triggers in psql?
Previous Message Zeljko Trogrlic 2000-09-14 14:04:55 RE: [HACKERS] List funnies ?