Re: Avoid full GIN index scan when possible

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Marc Cousin <cousinmarc(at)gmail(dot)com>
Subject: Re: Avoid full GIN index scan when possible
Date: 2019-06-28 22:54:23
Message-ID: 20190628225423.dgtyyux7tlk7p3fk@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 28, 2019 at 04:16:23PM -0400, Tom Lane wrote:
>Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>> On Fri, Jun 28, 2019 at 03:03:19PM -0400, Tom Lane wrote:
>>> I not only don't want that function in indxpath.c, I don't even want
>>> it to be known/called from there. If we need the ability for the index
>>> AM to editorialize on the list of indexable quals (which I'm not very
>>> convinced of yet), let's make an AM interface function to do it.
>
>> Wouldn't it be better to have a function that inspects a single qual and
>> says whether it's "optimizable" or not? That could be part of the AM
>> implementation, and we'd call it and it'd be us messing with the list.
>
>Uh ... we already determined that the qual is indexable (ie is a member
>of the index's opclass), or allowed the index AM to derive an indexable
>clause from it, so I'm not sure what you envision would happen
>additionally there. If I understand what Julien is concerned about
>--- and I may not --- it's that the set of indexable clauses *as a whole*
>may have or lack properties of interest. So I'm thinking the answer
>involves some callback that can do something to the whole list, not
>qual-at-a-time. We've already got facilities for the latter case.
>

I'm not sure I understand the problem either.

I don't think "indexable" is the thing we care about here - in Julien's
original example the qual with '%a%' is indexable. And we probably want
to keep it that way.

The problem is that evaluating some of the quals may be inefficient with
a given index - but only if there are other quals. In Julien's example
it makes sense to just drop the '%a%' qual, but only when there are some
quals that work with the trigram index. But if there are no such 'good'
quals, it may be better to keep al least the bad ones.

So I think you're right we need to look at the list as a whole.

>> But that kinda resembles stuff we already have - selectivity/cost. So
>> why shouldn't this be considered as part of costing?
>
>Yeah, I'm not entirely convinced that we need anything new here.
>The cost estimate function can detect such situations, and so can
>the index AM at scan start --- for example, btree checks for
>contradictory quals at scan start. There's a certain amount of
>duplicative effort involved there perhaps, but you also have to
>keep in mind that we don't know the values of run-time-determined
>comparison values until scan start. So if you want certainty rather
>than just a cost estimate, you may have to do these sorts of checks
>at scan start.
>

Right, that's why I suggested doing this as part of costing, but you're
right scan start would be another option. I assume it should affect cost
estimates in some way, so the cost function would be my first choice.

But does the cost function really has enough info to make such decision?
For example, ignoring quals is valid only if we recheck them later. For
GIN that's not an issue thanks to the bitmap index scan.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2019-06-28 23:46:34 Re: [HACKERS] WAL logging problem in 9.4.3?
Previous Message Nikita Glukhov 2019-06-28 22:50:18 Re: Avoid full GIN index scan when possible