Re: Use of additional index columns in rows filtering

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Maxim Ivanov <hi(at)yamlcoder(dot)me>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, markus(dot)winand(at)winand(dot)at
Subject: Re: Use of additional index columns in rows filtering
Date: 2023-08-04 02:07:29
Message-ID: CAH2-WznrdAv-bj_Dd2c83fys2TXJ9MvU97EvH92PW7pDOC9iZQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 3, 2023 at 3:04 PM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> Because my patch is all about reducing the heap pages, which are usually
> the expensive part of the index scan. But you're right the "index scan"
> with index filter may access more index pages, because it has fewer
> "access predicates".

It's not so much the unnecessary index page accesses that bother me.
At least I didn't push that aspect very far when I constructed my
adversarial test case -- index pages were only a small part of the
overall problem. (I mean the problem at runtime, in the executor. The
planner expected to save a small number of leaf page accesses, which
was kinda, sorta the problem there -- though the planner might have
technically still been correct about that, and can't have been too far
wrong in any case.)

The real problem that my adversarial case seemed to highlight was one
of extra heap page accesses. The tenk1 table's VM is less than one
page in size, so the extra buffer hits can't plausibly have been VM
buffer hits. Sure, there were no "table filters" involved -- only
"index filters". But even "index filters" require heap access when the
page isn't marked all-visible in the VM.

That problem just cannot happen with a similar plan that eliminates
the same index tuples within the index AM proper (the index quals
don't even have to be "access predicates" for this to apply, either).
Of course, we never need to check the visibility of index tuples just
to be able to consider eliminating them via nbtree search scan
keys/index quals -- and so there is never any question of heap/VM
access for tuples that don't pass index quals. Not so for "index
filters", where there is at least some chance of accessing the heap
proper just to be able to eliminate non-matches.
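
To make the contrast concrete, here is a toy Python model of the two
cases. It is only a sketch under stated assumptions: it is not
PostgreSQL code and not the patch's logic, and every name in it
(IndexTuple, scan_with_index_qual, scan_with_index_filter, the
all_visible set standing in for the VM) is hypothetical. It counts only
the heap fetches spent on tuples that end up being rejected.

```python
from dataclasses import dataclass


@dataclass
class IndexTuple:
    a: int          # leading index column
    b: int          # additional index column
    heap_page: int  # heap page that the index tuple points to


def scan_with_index_qual(index, all_visible, a_val, b_val):
    """Both conditions are index quals, evaluated inside the index AM."""
    matches = 0
    wasted_heap_fetches = 0  # heap fetches for tuples that get rejected
    for t in index:
        if t.a != a_val or t.b != b_val:
            # Rejected using the index tuple alone: no visibility check,
            # so neither the VM nor the heap is consulted.
            continue
        matches += 1
    return matches, wasted_heap_fetches


def scan_with_index_filter(index, all_visible, a_val, b_val):
    """Only `a` is an index qual; `b` is checked as a filter afterwards."""
    matches = 0
    wasted_heap_fetches = 0
    for t in index:
        if t.a != a_val:
            continue
        # Assumption of this model: visibility has to be settled before
        # the filter result can be trusted, so a page that is not
        # all-visible costs a heap fetch even for a non-match.
        needs_heap = t.heap_page not in all_visible
        if t.b == b_val:
            matches += 1
        elif needs_heap:
            wasted_heap_fetches += 1
    return matches, wasted_heap_fetches


# Ten index tuples with a = 1 and b = 0..9, one per heap page.
index = [IndexTuple(a=1, b=i, heap_page=i) for i in range(10)]
all_visible = {0, 1, 2, 3, 4}  # pages 5..9 are not all-visible

qual_result = scan_with_index_qual(index, all_visible, a_val=1, b_val=3)
filter_result = scan_with_index_filter(index, all_visible, a_val=1, b_val=3)
print(qual_result, filter_result)
```

In this model the index qual version never touches the heap for a
non-match, whatever the state of the VM, while the cost of the index
filter version grows with the number of non-matching tuples that sit on
pages not marked all-visible.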

While I think that it makes sense to assume that "index filters" are
strictly better than "table filters" (assuming they're directly
equivalent in that they contain the same clause), they're not
*reliably* any better. So "index filters" are nowhere near as good as
an equivalent index qual (AFAICT we should always assume that that's
true). This is true of index quals generally --
this advantage is *not* limited to "access predicate index quals". (It
is most definitely not okay for "index filters" to displace equivalent
"access predicate index quals", but it's also not really okay to allow
them to displace equivalent "index filter predicate index quals" --
the latter case is less bad, but AFAICT they both basically aren't
acceptable "substitutions".)

--
Peter Geoghegan
