Quick Links

Re: Use of additional index columns in rows filtering

From:	Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To:	Peter Geoghegan <pg(at)bowt(dot)ie>, Jeff Davis <pgsql(at)j-davis(dot)com>
Cc:	James Coleman <jtc331(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Maxim Ivanov <hi(at)yamlcoder(dot)me>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, markus(dot)winand(at)winand(dot)at
Subject:	Re: Use of additional index columns in rows filtering
Date:	2023-08-02 13:48:28
Message-ID:	5af6e20f-130b-dbe2-7501-11a1c545e49f@enterprisedb.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 8/2/23 02:50, Peter Geoghegan wrote:
> On Mon, Jul 24, 2023 at 11:59 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>>> That might be true but I'm not sure what to do about that unless we
>>> incorporate some "robustness" measure into the costing. If every
>>> measure we have says one plan is better, don't we have to choose it?
>>
>> I'm mostly concerned about the possibility itself -- it's not a matter
>> of tuning the cost. I agree that that approach would probably be
>> hopeless.
>
> This seems related to the fact that EXPLAIN doesn't expose the
> difference between what Markus Winand calls "Access Predicates" and
> "Index Filter Predicates", as explained here:
>
> https://use-the-index-luke.com/sql/explain-plan/postgresql/filter-predicates
>
> That is, both "Access Predicates" and "Index Filter Predicates" are
> shown after an "Index Cond: " entry in Postgres EXPLAIN output, in
> general. Even though these are two very different things. I believe
> that the underlying problem for the implementation (the reason why we
> can't easily break this out further in EXPLAIN output) is that we
> don't actually know what kind of predicate it is ourselves -- at least
> not until execution time. We wait until then to do nbtree
> preprocessing/scan setup. Though perhaps we should do more of this
> during planning instead [1], for several reasons (fixing this is just
> one of those reasons).
>

How come we don't know that until the execution time? Surely when
building the paths/plans, we match the clauses to the index keys, no? Or
are you saying that just having a scan key is not enough for it to be
"access predicate"?

Anyway, this patch is mostly about "Index Cond" mixing two types of
predicates. But the patch is really about "Filter" predicates - moving
some of them from table to index. So quite similar to the "index filter
predicates" except that those are handled by the index AM.

> The risk to "robustness" for cases like the one I drew attention to on
> this thread would probably have been obvious all along if EXPLAIN
> output were more like what Markus would have us do -- he certainly has
> a good point here, in general.
>
> Breaking things out in EXPLAIN output along these lines might also
> give us a better general sense of when a similar plan shift like this
> was actually okay -- even according to something like my
> non-traditional "query robustness" criteria. It's much harder for me
> to argue that a shift in plans from what Markus calls an "Index Filter
> Predicate" to what the patch will show under "Index Filter:" is a
> novel new risk. That would be a much less consequential difference,
> because those two things are fairly similar anyway.
>

But differentiating between access and filter predicates (at the index
AM level) seems rather independent of what this patch aims to do.

FWIW I agree we should make the differences visible in the explain. That
seems fairly useful for non-trivial index access paths, and it does not
change the execution at all. I think it'd be fine to do that only for
VERBOSE mode, and only for EXPLAIN ANALYZE (if we only do this at
execution time for now).

> Besides, such a shift in plan would have to "legitimately win" for
> things to work out like this. If we're essentially picking between two
> different subtypes of "Index Filter Predicate", then there can't be
> the same weird second order effects that we see when an "Access
> Predicate" is out-competed by an "Index Filter Predicate". It's
> possible that expression evaluation of a small-ish conjunctive
> predicate like "Index Filter: ((tenthous = 1) OR (tenthous = 3) OR
> (tenthous = 42))" will be faster than a natively executed SAOP. You
> can't do that kind of expression evaluation in the index AM itself
> (assuming that there is an opclass for nbtree to use in the first
> place, which there might not be in the case of any non-key INCLUDE
> columns). With the patch, you can do all this. And I think that you
> can derisk it without resorting to the overly conservative approach of
> limiting ourselves to non-key columns from INCLUDE indexes.
>

I'm not following. Why couldn't there be some second-order effects?
Maybe it's obvious / implied by something said earlier, but I think
every time we decide between multiple choices, there's a risk of picking
wrong.

Anyway, is this still about this patch or rather about your SAOP patch?

> To summarize: As Markus says on the same page. "Index filter
> predicates give a false sense of safety; even though an index is used,
> the performance degrades rapidly on a growing data volume or system
> load". That's essentially what I want to avoid here. I'm much less
> concerned about competition between what are really "Index Filter
> Predicate" subtypes. Allowing that competition to take place is not
> entirely risk-free, of course, but it seems about as risky as anything
> else in this area.
>

True. IMHO the danger or "index filter" predicates is that people assume
index conditions eliminate large parts of the index - which is not
necessarily the case here. If we can improve this, cool.

But again, this is not what this patch does, right? It's about moving
stuff from "table filter" to "index filter". And those clauses were not
matched to the index AM at all, so it's not really relevant to the
discussion about different subtypes of predicates.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Use of additional index columns in rows filtering at 2023-08-02 00:50:14 from Peter Geoghegan

Responses

Re: Use of additional index columns in rows filtering at 2023-08-03 01:32:40 from Peter Geoghegan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Rui Zhao	2023-08-02 14:38:00	Re: pg_upgrade fails with in-place tablespace
Previous Message	Amit Langote	2023-08-02 13:39:45	Re: generic plans and "initial" pruning