Re: spgist text_ops and LIKE

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Subject: Re: spgist text_ops and LIKE
Date: 2012-02-02 18:21:40
Message-ID: CA+TgmoZsY3z3Z9DSLCB5EZpEPUMt8eovhiaqCYidJXRyeqS+Vg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 2, 2012 at 12:18 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Wed, Feb 1, 2012 at 10:57 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Too lazy to look at the code right now, but I think indxpath.c contains
>>> hardwired assumptions that LIKE prefix optimizations are only possible
>>> with btree indexes.  Possibly it would be worth relaxing that.
>
>> Is it as simple as the attached?
>
> Yeah, that looks about right, with the single quibble that the other
> places where we have #defines in pg_opfamily.h put them after the
> corresponding DATA line not before.  Please fix and commit.

Done.

>> On the flip side, spg_text_inner_consistent()
>> seems to fall back to a full index-scan for the regular comparison
>> operators, which makes me wonder why anyone would bother having the
>> operator at all.
>
> MHO is that that code path is only useful when you are in C locale;
> but I think that's a useful enough case to be worth the rather small
> amount of incremental code to handle it.  I'm not sure I believe your
> speculations about making spgist handle inequality searches in other
> locales.  Non-C sort orders are too bizarre and we have too little
> visibility into the precise rules the current locale is following.

Unfortunately, I fear that you are right, unless we were to switch
over to using a locale machinery of our own devising in place of that
provided by the OS. And no, I'm not proposing that.

> ... however ... could it be reasonable to apply strxfrm and store the
> result of that in the index?  Then searches would just need byte-by-byte
> comparisons.

Yeah, you could, but I don't see the point. I was hoping that spgist
would give us the ability to have a single index that allows both
inequality comparisons and pattern matching. If you're going to have
one index where the strings are stored "straight" and another where
strxfrm() is applied, then you could just build two btree indexes, one
the regular way and a second with text_pattern_ops.

I'm having trouble figuring out under what set of circumstances spgist
is expected to be the best available alternative. It only supports a
small subset of the data types that GiST does, so I suppose the point
is that it should be faster for the cases that it does handle. And,
considering that this is all brand-new code, the fact that it's almost
keeping pace with btree on both pattern-matching and equality
comparisons is certainly respectable -- but I so far haven't found any
cases where it's a clear win. There's limited glory in being the
almost-fastest way of indexing for a certain class of queries.
Admittedly, I haven't tried the point-in-box stuff yet.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Duncan Rance 2012-02-02 18:28:12 Re: BUG #6425: Bus error in slot_deform_tuple
Previous Message Robert Haas 2012-02-02 18:07:08 Re: keywords in pg_hba.conf are case-sensitive?