Re: Use of additional index columns in rows filtering

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>, Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Maxim Ivanov <hi(at)yamlcoder(dot)me>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Use of additional index columns in rows filtering
Date: 2023-07-20 11:35:40
Message-ID: b55abaeb-38df-930b-d42a-b1edeb81d2e8@enterprisedb.com
Lists: pgsql-hackers

On 7/19/23 23:38, Peter Geoghegan wrote:
> On Wed, Jul 19, 2023 at 1:46 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>> On Wed, 2023-07-19 at 20:03 +0200, Tomas Vondra wrote:
>>> Makes sense, I also need to think about maybe not having duplicate
>>> clauses in the two lists. What annoys me about that is that it
>>> partially prevents the cost-based reordering done by
>>> order_qual_clauses(). So maybe we should have three lists ... Also,
>>> some of the expressions could be fairly expensive.
>>
>> Can we just calculate the costs of the pushdown and do it when it's a
>> win? If the random_page_cost savings exceed the costs from evaluating
>> the clause earlier, then push down.
>
> My patch that teaches nbtree to execute ScalarArrayOps intelligently
> (by dynamically choosing to not re-descend the btree to perform
> another "primitive index scan" when the data we need is located on the
> same leaf page as the current ScalarArrayOps arrays) took an
> interesting turn recently -- one that seems related.
>
> I found that certain kinds of queries are dramatically faster once you
> teach the optimizer to accept that multi-column ScalarArrayOps can be
> trusted to return tuples in logical/index order, at least under some
> circumstances. For example:
>
> pg(at)regression:5555 [583930]=# create index order_by_saop on
> tenk1(two,four,twenty);
> CREATE INDEX
>
> pg(at)regression:5555 [583930]=# EXPLAIN (ANALYZE, BUFFERS)
> select ctid, thousand from tenk1
> where two in (0,1) and four in (1,2) and twenty in (1,2)
> order by two, four, twenty limit 20;
>
> This shows "Buffers: shared hit=1377" on HEAD, versus "Buffers: shared
> hit=13" with my patch. All because we can safely terminate the scan
> early now. The vast majority of the buffer hits the patch will avoid
> are against heap pages, even though I started out with the intention
> of eliminating unnecessary repeat index page accesses.
>
> Note that build_index_paths() currently refuses to allow SAOP clauses
> to be recognized as ordered with a multi-column index and a query with
> a clause for more than the leading column -- that is something that
> the patch needs to address (to get this particular improvement, at
> least). Allowing such an index path to have useful pathkeys is
> typically safe (in the sense that it won't lead to wrong answers to
> queries), but we still make a conservative assumption that they can
> lead to wrong answers. There are comments about "equality constraints"
> that describe the restrictions right now.
>
> But it's not just the question of basic correctness -- the optimizer
> is very hesitant to use multi-column SAOPs, even in cases that don't
> care about ordering. So it's also (I think, implicitly) a question of
> *risk*. The risk of getting very inefficient SAOP execution in nbtree,
> when it turns out that a huge number of "primitive index scans" are
> needed. But, if nbtree is taught to "coalesce together" primitive
> index scans at runtime, that risk goes way down.
>
> Anyway, this seems related to what you're talking about because the
> relationship between selectivity and ordering seems particularly
> important in this context. And because it suggests that there is at
> least some scope for adding "run time insurance" to the executor,
> which is valuable in the optimizer if it bounds the potential
> downside. If you can be practically certain that there is essentially
> zero risk of serious problems when the costing miscalculates (for a
> limited subset of cases), then life might be a lot easier -- clearly
> we should be biased in one particular direction with a case that has
> that kind of general profile.
>
> My current understanding of the optimizer side of things --
> particularly things like costing for "filter quals/pqquals" versus
> costing for "true index quals" -- is rather basic. That will have to
> change. Curious to hear your thoughts (if any) on how what you're
> discussing may relate to what I need to do with my patch. Right now my
> patch assumes that making SAOP clauses into proper index quals (that
> usually preserve index ordering) is an unalloyed good (when safe!).
> This assumption is approximately true on average, as far as I can
> tell. But it's probably quite untrue in various specific cases, that
> somebody is bound to care about.
>

I think the SAOP patch may need to be much more careful about this, but
for this patch it's simpler because it doesn't really change any of the
index internals, or the indexscan in general.

If I simplify that a bit, we're just reordering the clauses in a way that
may eliminate the heap fetch. The main risk seems to be that this forces
an expensive qual to the front of the list, just because it can be
evaluated on the index tuple. But for that to hurt, the extra evaluation
cost would need to exceed what we save by not doing the I/O - and
considering how expensive the I/O is, that seems unlikely. It could
happen for expensive quals that don't really eliminate many rows, I guess.
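To make that trade-off concrete, here is a rough sketch of the comparison
in Python. The names (random_page_cost, per-tuple qual cost, selectivity)
mirror planner concepts, but the formula is purely illustrative, not
PostgreSQL's actual costing:

```python
def pushdown_wins(rows, qual_cost_per_row, qual_selectivity,
                  heap_fetch_fraction, random_page_cost=4.0):
    """Illustrative only: return True if evaluating the qual on the
    index tuple (before the heap fetch) is expected to be cheaper
    than fetching first.

    rows                 -- tuples the index scan returns
    qual_cost_per_row    -- per-tuple cost of the pushed-down qual
    qual_selectivity     -- fraction of tuples the qual keeps
    heap_fetch_fraction  -- fraction of tuples needing a heap fetch anyway
    """
    # Cost of evaluating the qual earlier, on every index tuple.
    eval_cost = rows * qual_cost_per_row
    # Heap fetches avoided: tuples the qual rejects never hit the heap.
    avoided_fetches = rows * (1.0 - qual_selectivity) * heap_fetch_fraction
    saved_io = avoided_fetches * random_page_cost
    return saved_io > eval_cost

# A cheap, selective qual: the saved I/O dwarfs the evaluation cost.
assert pushdown_wins(10_000, 0.0025, 0.1, 0.9)
# An expensive qual that filters almost nothing: pushdown loses.
assert not pushdown_wins(10_000, 5.0, 0.99, 0.9)
```

This matches the intuition above: the pushdown only backfires for quals
that are expensive *and* not very selective, since then eval_cost grows
while saved_io stays small.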

Anyway, I see this as an extension of what order_qual_clauses() does. The
main issue is that even order_qual_clauses() admits the cost estimates are
somewhat unreliable, so we can't expect to make perfect decisions.
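For reference, the reordering order_qual_clauses() performs boils down to
a stable sort on estimated per-tuple cost, with security levels taking
precedence; a minimal illustrative sketch (the dict representation of a
qual is made up for this example):

```python
def order_quals(quals):
    """Mimic order_qual_clauses(): quals with a lower security level
    must run first; within the same level, sort by estimated per-tuple
    cost, cheapest first. The sort is stable, so ties keep their
    original (syntactic) order."""
    return sorted(quals, key=lambda q: (q["security_level"], q["cost"]))

quals = [
    {"name": "expensive_fn(x)", "cost": 100.0, "security_level": 1},
    {"name": "y = 1",           "cost": 0.01,  "security_level": 1},
    {"name": "rls_qual(z)",     "cost": 50.0,  "security_level": 0},
]
# The security qual runs first despite its cost; the rest go cheapest-first.
```

The unreliability mentioned above comes from the cost field: it's derived
from estimates, so two orderings that look close on paper can behave very
differently at execution time.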

FWIW: While reading order_qual_clauses() I realized the code may need to
be more careful about the leakproof machinery. Essentially, if any of the
non-index clauses is a security qual, we can only evaluate leakproof
quals on the index tuple ahead of it.
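That check amounts to an eligibility filter applied before any
reordering; a hypothetical sketch (the boolean flags on each qual are
invented for illustration, not taken from RestrictInfo):

```python
def index_pushable(quals):
    """Split quals into those that may be evaluated directly on the
    index tuple and those that must wait for the heap fetch. If any
    qual is a security qual (e.g. from RLS or a security-barrier view),
    only leakproof quals may be pushed ahead of it."""
    has_security_qual = any(q["is_security_qual"] for q in quals)
    pushable, rest = [], []
    for q in quals:
        if q["index_evaluable"] and (q["leakproof"] or not has_security_qual):
            pushable.append(q)
        else:
            rest.append(q)
    return pushable, rest
```

So a leaky but index-evaluable qual is only held back when a security
qual is present; without one, it remains a pushdown candidate.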

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
