Re: Returning nbtree posting list TIDs in DESC order during backwards scans

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Mircea Cadariu <cadariu(dot)mircea(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Returning nbtree posting list TIDs in DESC order during backwards scans
Date: 2025-07-17 18:26:35
Message-ID: CAH2-WznRwooVLZ4pZFFqZx7hT7HsSdt_et9eAL2EYG6qv0pT7A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 16, 2025 at 5:27 PM Mircea Cadariu <cadariu(dot)mircea(at)gmail(dot)com> wrote:
> Does the above change mean it will have to do more work in the loop?
> Whereas before it visited strictly killed, it now has to go through all
> of them?

Yes, that's true. Any item that the scan returns from the so->currPos
page needs to be considered within the loop.

The loop has an early check for this (for non-itemDead entries) here:

    /* Quickly skip over items never ItemDead-set by btgettuple */
    if (!kitem->itemDead)
        continue;

I really doubt that this matters, because:

* This can only happen when we actually call _bt_killitems in the
first place, so there has to be at least one item whose index tuple
really does need to be LP_DEAD-set.

* The chances of there being a huge number of so->currPos.items[]
items but only one or two with their "itemDead" bit set seem low, in
general.

* The new loop is significantly simpler in that it iterates through
so->currPos.items[] in order, without any of the so->killedItems[]
indirection you see on HEAD. Modern CPUs are likely to skip over
non-itemDead entries very quickly.
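
To make the shape of that loop concrete, here is a minimal standalone
sketch. It is not the patch itself: SketchPosItem and
sketch_collect_dead are hypothetical names, and the struct is a
stripped-down stand-in for a so->currPos.items[] entry that keeps only
a page offset and the "itemDead" flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for one so->currPos.items[] entry */
typedef struct SketchPosItem
{
    int  indexOffset;   /* where the index tuple sits on the leaf page */
    bool itemDead;      /* set when the scan found the item to be dead */
} SketchPosItem;

/*
 * Walk items[] strictly in array order, skipping entries that were never
 * marked dead.  The offsets of the dead entries are written to out[], and
 * the number of dead entries is returned.
 */
int
sketch_collect_dead(const SketchPosItem *items, int nitems, int *out)
{
    int ndead = 0;

    assert(nitems >= 0);

    for (int i = 0; i < nitems; i++)
    {
        const SketchPosItem *kitem = &items[i];

        /* Quickly skip over items that were never marked dead */
        if (!kitem->itemDead)
            continue;

        out[ndead++] = kitem->indexOffset;
    }

    return ndead;
}
```

The only per-item cost for a live entry is one flag test on memory that
is being read sequentially, which is the access pattern the point above
is relying on.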

Note that so->killedItems[] (which this patch removes) can be in
ascending leaf-page-wise order, descending leaf-page-wise order, or
(with a scrollable cursor) some random mix of the two -- even when
there are no posting list tuples involved.
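
For comparison, the indirection can be modelled roughly as follows.
This is a simplified sketch of the idea only, not the code on HEAD:
sketch_visit_via_killed is a hypothetical name, and item_offsets[]
stands in for the page offsets of so->currPos.items[]. The assumption
is that killed[] holds indexes into that array, in whatever order the
scan happened to record them.

```c
#include <assert.h>

/*
 * Visit items through an array of indexes (a stand-in for
 * so->killedItems[]).  Items are visited in the order the indexes were
 * recorded, which need not match their order on the leaf page.
 * Writes the visited page offsets to out[] and returns how many were
 * visited.
 */
int
sketch_visit_via_killed(const int *item_offsets, int nitems,
                        const int *killed, int nkilled, int *out)
{
    int nvisited = 0;

    for (int i = 0; i < nkilled; i++)
    {
        int itemIndex = killed[i];

        /* Each recorded index must point at a valid items[] slot */
        assert(itemIndex >= 0 && itemIndex < nitems);

        out[nvisited++] = item_offsets[itemIndex];
    }

    return nvisited;
}
```

With killed[] = {3, 1, 2} the page offsets come back in the order of
items 3, 1, 2, so any code consuming them has to cope with a
non-monotonic sequence.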

--
Peter Geoghegan
