Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, benoit <benoit(at)hopsandfork(dot)com>
Subject: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Date: 2023-07-31 16:34:47
Message-ID: CAH2-Wz=Ts_kouW0WJ=Us_DdN1TuLgEFxQS1dBrGicwGvZr8gOQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 27, 2023 at 10:00 AM Matthias van de Meent
<boekewurm+postgres(at)gmail(dot)com> wrote:
> My idea is not quite block nested loop join. It's more 'restart the
> index scan at the location the previous index scan ended, if
> heuristics say there's a good chance that might save us time'. I'd say
> it is comparable to the fast tree descent optimization that we have
> for endpoint queries, and comparable to this patch's scankey
> optimization, but across AM-level rescans instead of internal rescans.

Yeah, I see what you mean. Seems related, even though what you've
shown in your prototype patch doesn't seem like it fits into my
taxonomy very neatly.

(BTW, I was a little confused by the use of the term "endpoint" at
first, since there is a function that uses that term to refer to a
descent of the tree that happens without any insertion scan key. This
path is used whenever the best we can do in _bt_first is to descend to
the rightmost or leftmost page.)

> The basic design of that patch is this: We keep track of how many
> times we've rescanned, and the end location of the index scan. If a
> new index scan hits the same page after _bt_search as the previous
> scan ended, we register that.

I can see one advantage that block nested loop join would retain here:
it does block-based accesses on both sides of the join. Since it
"looks ahead" on both sides of the join, more repeat accesses are
likely to be avoided.

Not too sure how much that matters in practice, though.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alena Rybakina 2023-07-31 16:38:23 Re: POC, WIP: OR-clause support for indexes
Previous Message Dave Cramer 2023-07-31 16:27:45 Re: Request for comment on setting binary format output per session