Re: Use of additional index columns in rows filtering

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Maxim Ivanov <hi(at)yamlcoder(dot)me>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Use of additional index columns in rows filtering
Date: 2023-07-21 03:32:33
Message-ID: CAH2-WzmYOhJHTCXEq36Xra5ajbX+bgawTJ9ADimOqTX0W+4ucw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 20, 2023 at 4:35 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> I think the SAOP patch may need to be much more careful about this, but
> for this patch it's simpler because it doesn't really change any of the
> index internals, or the indexscan in general.

It's true that the SAOP patch needs relatively complicated
infrastructure to assess whether or not the technique is safe with a
given set of quals. You cannot safely get an ordered index scan for
something like "select * from table where a < 5 and b in (1,2,3) order
by a, b". With or without my patch. My patch isn't really changing all
that much about the behavior in nbtree, as these things go. It's
*surprising* how little has to change about the high level structure
of index scans, in fact.
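
To make that concrete, here's a toy version of that case (table and
column names are mine, purely for illustration):

    create table tab (a int, b int);
    create index on tab (a, b);

    -- The SAOP on b means the scan cannot be assumed to return rows
    -- in (a, b) order: in effect, each array element for b drives its
    -- own pass over the a < 5 range, so rows come back grouped by b
    -- and ordered by a within each group.  The planner therefore
    -- can't rely on the index to satisfy the ORDER BY without a sort.
    select * from tab where a < 5 and b in (1, 2, 3) order by a, b;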

(Actually, I'm glossing over a lot. The MDAM paper describes
techniques that'd make even the really challenging cases safe, through
a process of converting quals from conjunctive normal form into
disjunctive normal form. This is more or less the form that the state
machine implemented by _bt_advance_array_keys() produces already,
today. But even with all this, practically all of the heavy lifting
takes place before the index scan even begins, during preprocessing --
so you still require surprisingly few changes to index scans
themselves.)
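
As a rough sketch of that CNF-to-DNF idea (my own toy example, not
taken from the paper or the patch): a conjunction of per-column
conditions becomes a disjunction of single-value combinations, each of
which corresponds to one contiguous range of an index on (a, b):

    -- CNF: one conjunct per column
    where a in (1, 2) and b in (10, 20)

    -- Equivalent DNF: one disjunct per combination of values
    where (a = 1 and b = 10)
       or (a = 1 and b = 20)
       or (a = 2 and b = 10)
       or (a = 2 and b = 20)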

> If I simplify that a bit, we're just reordering the clauses in a way to
> maybe eliminate the heap fetch. The main risk seems to be that this will
> force an expensive qual to the front of the list, just because it can be
> evaluated on the index tuple.

My example query might have been poorly chosen, because it involved a
limit. What I'm thinking of is more general than that.

> But the difference would need to be worse
> than what we save by not doing the I/O - considering how expensive the
> I/O is, that seems unlikely. Could happen for expensive quals that don't
> really eliminate many rows, I guess.

That sounds like the same principle that I was talking about. I think
that it can be pushed quite far, though. I am mostly talking about the
worst case, and it seems like you might not be.

You can easily construct examples where some kind of skew causes big
problems with a multi-column index. I'm thinking of indexes whose
leading columns are low cardinality, and queries where including the
second column as an index qual looks kind of marginal to the
optimizer. Each grouping represented in the most significant index
column might easily have its own unique characteristics; the
distribution of values in subsequent columns might be quite
inconsistent across each grouping, in whatever way.
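
Something along these lines, say (schema and data distribution are
made up):

    create table orders (region text, customer_id int, payload text);
    create index on orders (region, customer_id);

    -- Suppose nearly all 'emea' rows belong to a handful of large
    -- customers, while 'apac' rows are spread thinly across many
    -- customers.  A qual on customer_id is then nearly useless within
    -- one grouping and highly selective within another, and no single
    -- whole-index selectivity number can describe both at once.
    select * from orders
    where region in ('emea', 'apac') and customer_id = 12345;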

Since nothing stops a qual on a lower-order column from having a wildly
different selectivity for one particular grouping, it might not make
sense to say that a problem in this area is due to a bad selectivity
estimate. Even if we have perfect estimates, what good can they do if
the optimal approach is to vary our strategy at runtime, *within* an
individual index scan, as different parts of the key space (different
groupings) are traversed? To skip or not to skip, say. This
isn't about picking the cheapest plan, really.

That's another huge advantage of index quals -- they can (at least in
principle) allow you to skip over big parts of the index when it just
ends up making sense, in whatever way, for whatever reason. In the
index, and in the heap. Often both. You'd likely always prefer to err
in the direction of having more index quals rather than fewer, when
doing so doesn't substantially change the plan itself. It could be
very cheap insurance, even without any limit. (It would probably also
be a lot faster, but it needn't be.)
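
For instance (a sketch with made-up names; the details of the plan
will of course depend on the planner):

    create table t (a int, b int, payload text);
    create index on t (a, b);

    -- Written this way, b = 42 can be an index qual: non-matching
    -- tuples can be rejected using the index alone, and in principle
    -- whole parts of the index (and heap) need never be visited.
    select * from t where a < 1000 and b = 42;

    -- Written so that it can only ever be a filter, the same
    -- condition is typically evaluated only after each a < 1000 row
    -- has already been fetched.
    select * from t where a < 1000 and (b + 0) = 42;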

> Anyway, I see this as extension of what order_qual_clauses() does. The
> main issue is that even order_qual_clauses() admits the estimates are
> somewhat unreliable, so we can't expect to make perfect decisions.

The attribute value independence assumption is wishful thinking, in no
small part -- it's quite surprising that it works as well as it does,
really.
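
A toy illustration of just how wishful it can be (numbers are mine):

    create table addresses (city text, zip text);

    -- Say city = 'X' matches 1% of rows, zip = 'Y' matches 1% of
    -- rows, and every zip = 'Y' row happens to be in city 'X'.
    -- Multiplying the two selectivities gives 0.01 * 0.01 = 0.0001
    -- (0.01% of rows), while the true selectivity of the conjunction
    -- is 0.01 (1%) -- a 100x underestimate.
    select * from addresses where city = 'X' and zip = 'Y';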

--
Peter Geoghegan
