Re: index prefetching

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: index prefetching
Date: 2025-12-04 05:54:09
Message-ID: CA+HiwqGO9DKT_edGTdTiogqPekG_tCsYDEb_o8B0C+HZ2K9aNg@mail.gmail.com
Lists: pgsql-hackers

Hi Peter,

On Mon, Dec 1, 2025 at 10:24 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Mon, Nov 10, 2025 at 6:59 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > The new tentative plan is to cut scope by focussing on switching over
> > to the new index AM + table AM interface from the patch in the short
> > term, for Postgres 19.
>
> Attached patch makes the table AM revisions we talked about. This is a
> significant change in direction, so I'm adopting a new patch
> versioning scheme: this new version is v1. (I just find it easier to
> deal with sequential patch version numbers.)
>
> I'm sure that I'll have made numerous mistakes in this new v1. There
> will certainly be some bugs, and some of the exact details of how I'm
> doing the layering are likely suboptimal or even wrong. I am
> nevertheless cautiously optimistic that this will be the last major
> redesign that will be required for this project.
>
...
>
> Here are the things that I'd like to ask from reviewers, and from Tomas:
>
> * Review of the table AM changes, with a particular emphasis on high
> level architectural choices.
>
> * Most importantly: will the approach in this new v1 avoid painting
> ourselves into a corner? It can be incomplete, as long as it doesn't
> block progress on things we're likely to want to do in the next couple
> of releases.

I was looking at your email and the v1 patch and recalled your earlier
note from my executor batching thread [1], where you mentioned:

"I think that the base index prefetching patch's current notion of
index-AM-wise batches can be kept quite separate from any table AM
batch concept that might be invented, either as part of what I'm
working on, or in Amit's patch. It probably wouldn't be terribly
difficult to get the new interface I've described to return heap
tuples in whatever batch format Amit comes up with. ... I doubt that
adopting Amit's batch format will make life much harder for the
heap_hot_search_buffer-batching mechanism (at least if it is generally
understood that its new index scan interface's builds batches in
Amit's format on a best-effort basis)."

I want to acknowledge that figuring out the right layering to make I/O
prefetching and perhaps other optimizations internal to IndexNext()
work is obviously the priority right now, regardless of the output
format used to populate the slots ultimately returned by
table_index_getnext_slot(). However, regarding your question about
"painting ourselves into a corner":

In my executor batching work (which has focused on Seq Scans), the
HeapBatch is essentially just a pinned buffer plus an array of
pre-allocated tuple headers. I hadn't strictly considered creating a
HeapBatch to return from Index Scans, largely because
heap_hot_search_buffer() is designed for scalar (or non-batched)
access that requires repeated buffer locking.

But it seems like the eventual goal of batching calls to
heap_hot_search_buffer() effectively clears that hurdle. As long as
the internal logic separates the "grouping/locking" from the
"materializing into a slot," it seems this design does not prevent us
from eventually wiring up a table_index_getnext_batch() to populate
the HeapBatch structure I am proposing for the regular non-index scan
path (table_scan_getnextbatch() in my patch).
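
To make that separation concrete, here is a toy sketch in plain C. Every name in it (SketchTid, SketchHeapBatch, sketch_fill_batch, sketch_materialize) is hypothetical and exists only for illustration; none of it comes from either patch or from PostgreSQL itself. A block number stands in for the pinned buffer and an array of offsets stands in for the pre-allocated tuple headers. The only point is that phase 1 (grouping TIDs that share a heap block, where the lock would be taken once) does not need to know how phase 2 (copying one entry into a slot) works.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All types below are hypothetical stand-ins, not PostgreSQL definitions. */
#define SKETCH_BATCH_MAX 64

typedef struct SketchTid
{
    uint32_t    block;      /* heap block number */
    uint16_t    offset;     /* line pointer offset within the block */
} SketchTid;

typedef struct SketchHeapBatch
{
    uint32_t    pinned_block;               /* stands in for a pinned buffer */
    size_t      ntuples;                    /* number of valid entries */
    uint16_t    offsets[SKETCH_BATCH_MAX];  /* stands in for tuple headers */
} SketchHeapBatch;

typedef struct SketchSlot
{
    uint32_t    block;
    uint16_t    offset;
    int         valid;
} SketchSlot;

/*
 * Phase 1: grouping/locking.  Consume the leading run of TIDs that point
 * into the same heap block, so that a real implementation would only have
 * to lock that block once.  Returns the number of TIDs consumed.
 */
static size_t
sketch_fill_batch(const SketchTid *tids, size_t ntids, SketchHeapBatch *batch)
{
    size_t      i = 0;

    batch->ntuples = 0;
    if (ntids == 0)
        return 0;

    batch->pinned_block = tids[0].block;
    /* A real implementation would pin and lock the buffer here, once. */
    while (i < ntids && i < SKETCH_BATCH_MAX &&
           tids[i].block == batch->pinned_block)
    {
        batch->offsets[i] = tids[i].offset;
        i++;
    }
    /* ... and release the lock here, keeping only the pin. */
    batch->ntuples = i;
    return i;
}

/*
 * Phase 2: materializing into a slot.  Reads only from the batch; it does
 * not care how the batch was built.
 */
static void
sketch_materialize(const SketchHeapBatch *batch, size_t idx, SketchSlot *slot)
{
    assert(idx < batch->ntuples);
    slot->block = batch->pinned_block;
    slot->offset = batch->offsets[idx];
    slot->valid = 1;
}
```

With TIDs (7,1), (7,4), (7,9), (8,2), the first call to sketch_fill_batch() consumes the three entries on block 7, and a second call on the remainder picks up block 8. A caller could either materialize entries one at a time or hand the whole batch upward, which is the flexibility I am asking about.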

Sorry to hijack the thread, but I just wanted to confirm that I haven't
misunderstood the architectural implications for future batching. Now
off to continue reading the new indexbatch.c, which kind of reminds me
of the stuff I've added in my execBatch.c. :-)

--
Thanks, Amit Langote

[1] https://www.postgresql.org/message-id/CAH2-WznijhPtw2vtwCtfFSwamwkT2O1KXMx6tE%2BeoHi3CKwRFg%40mail.gmail.com
