Re: Bitmap index scans use of filters on available columns

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Bitmap index scans use of filters on available columns
Date: 2015-11-05 02:15:31
Message-ID: 563ABBC3.1050309@2ndquadrant.com
Lists: pgsql-hackers

Hi,

On 11/04/2015 11:32 PM, Tom Lane wrote:
> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
>> On Wed, Nov 4, 2015 at 7:14 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> You're missing my point: that is possible in an indexscan, but
>>> *not* in a bitmap indexscan, because the index AM APIs are
>>> totally different in the two cases. In a bitmap scan, nothing
>>> more than a TID bitmap is ever returned out to anyplace that
>>> could execute arbitrary expressions.
>
>> I had thought it must already be able to execute arbitrary
>> expressions, due to the ability to already support user-defined
>> btree ops (and ops of non-btree types in the case of other index
>> types).
>
> No. An index AM is only expected to be able to evaluate clauses of
> the form <indexed_column> <indexable_operator> <constant>, and the
> key restriction there is that the operator is one that the AM has
> volunteered to support. Well, actually, it's the opclass more than
> the AM that determines this, but anyway it's not just some random
> operator; more than likely, the AM and/or opclass has got special
> logic about the operator.

Isn't that pretty much exactly the point made by Jeff and Simon? The
index AM is currently only allowed to handle the indexable operators,
i.e. operators that it can explicitly optimize (e.g. use to walk the
btree and such), and it completely ignores the other operators despite
having all the columns in the index. That means we have to do the
heap fetch, which usually means a significant performance hit.
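
To make the cost concrete, here is a toy model in plain Python. It is
only a sketch and has nothing to do with the actual executor or index AM
code: the "heap" is a dict keyed by TID, the "index" is a sorted list of
(a, b, tid) tuples, and both function names are made up for illustration.
It compares how many heap fetches are needed when a filter on b is
evaluated after the heap fetch versus directly on the index tuple.

```python
# Toy model only: a "heap" of rows keyed by TID and an "index" on (a, b).
# None of these names correspond to real PostgreSQL structures.
heap = {tid: {"a": tid % 10, "b": tid % 7, "payload": "x" * 8}
        for tid in range(1000)}
index = sorted((row["a"], row["b"], tid) for tid, row in heap.items())


def scan_with_heap_recheck(a_value, b_filter):
    """Index handles only `a = const`; the filter on b runs after a heap fetch."""
    heap_fetches = 0
    result = []
    for a, _b, tid in index:
        if a != a_value:          # the "indexable" clause
            continue
        row = heap[tid]           # heap fetch needed just to look at b
        heap_fetches += 1
        if b_filter(row["b"]):
            result.append(tid)
    return result, heap_fetches


def scan_with_index_filter(a_value, b_filter):
    """Same query, but the filter on b is evaluated on the index tuple."""
    heap_fetches = 0
    result = []
    for a, b, tid in index:
        if a != a_value:
            continue
        if not b_filter(b):       # b is already available in the index tuple
            continue
        _row = heap[tid]          # fetch only the rows that survive the filter
        heap_fetches += 1
        result.append(tid)
    return result, heap_fetches
```

Both functions return the same TIDs; the second one only touches the
heap for rows that pass the filter on b, which is the saving being
discussed here.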

>
> This also ties into Robert's point about evaluation of operators
> against index entries for dead or invisible rows. Indexable operators
> are much less likely than others to have unexpected side-effects.

I certainly understand there are cases that require care - like the
leakproof thing pointed out by Robert for example. I don't immediately
see why evaluation against dead rows would be a problem.

But maybe we can derive a set of rules that the operators must satisfy?
Say, only those marked as leakproof when RLS is enabled on the table, and
perhaps additional things.

A "brute-force" way would be to extend each index AM with every possible
operator, but that's hardly manageable, I guess. But why couldn't we
provide a generic infrastructure that would allow filtering "safe"
expressions and validating them on an index tuple?
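
Roughly what I have in mind, again as a Python sketch and not a proposal
for the real API. Everything here (Operator, Clause, the leakproof flag
as a plain boolean, and the three functions) is hypothetical, and the
rule set is just the one mentioned above: the column must be in the
index, and the operator must be leakproof if RLS is enabled on the table.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Operator:
    name: str
    func: Callable[[int, int], bool]
    leakproof: bool          # hypothetical flag, mirrors the LEAKPROOF idea


@dataclass(frozen=True)
class Clause:
    column: str
    op: Operator
    constant: int


def is_safe_on_index_tuple(clause, index_columns, rls_enabled):
    """Hypothetical rules: column is in the index, and the operator is
    leakproof whenever RLS is enabled on the table."""
    if clause.column not in index_columns:
        return False
    if rls_enabled and not clause.op.leakproof:
        return False
    return True


def split_clauses(clauses, index_columns, rls_enabled):
    """Partition clauses into those checked on the index tuple and the rest."""
    on_index, on_heap = [], []
    for c in clauses:
        if is_safe_on_index_tuple(c, index_columns, rls_enabled):
            on_index.append(c)
        else:
            on_heap.append(c)
    return on_index, on_heap


def index_tuple_passes(index_tuple, clauses):
    """Evaluate the 'safe' clauses against a dict-shaped index tuple."""
    return all(c.op.func(index_tuple[c.column], c.constant) for c in clauses)
```

Clauses that end up in the second list would still be evaluated after
the heap fetch, exactly as today, so the rule set can start out very
conservative and be relaxed later.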

kind regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
