Re: index prefetching

From:	Peter Geoghegan <pg(at)bowt(dot)ie>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Tomas Vondra <tomas(at)vondra(dot)me>, Alexandre Felipe <o(dot)alexandre(dot)felipe(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject:	Re: index prefetching
Date:	2026-04-03 05:17:14
Message-ID:	CAH2-WzkiCK=wELiXPgriN4r7cJzGb3Xg48E9YHrFEyEPTkynOw@mail.gmail.com
Lists:	pgsql-hackers

On Wed, Apr 1, 2026 at 6:50 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> Attached is v20, which does things this way. Now it's completely clear
> that the scan's batch ring buffer is fully controlled by the table AM.

Attached is v21, which advances things in the direction established in
v20. It also fixes bitrot -- we no longer carry certain read stream
related patches from Andres that were committed since I posted v20 a
couple of days ago.

> There are now only 2 remaining indexbatch.c functions that get called
> from indexam.c. Since these are both trivial, highly generic
> functions, restructuring things so that heapam would call these two
> seemed unnecessary.

I changed my mind about this: I now think it's better if indexam.c
never directly calls indexbatch.c. Just for flexibility and
extensibility. That's how it's done in v21.

In v21, the code in indexbatch.c is about 80% table AM code, 20% index
AM code -- nothing more. That seems a bit cleaner to me, because the
table AM no longer needs to know that the batchringbuf must have
already been initialized on its behalf (nothing like that is
required). The table AM fully owns batchringbuf, and initializes it
for itself where needed.

This means I've had to add a new table AM callback to take a mark,
which isn't strictly necessary for heapam -- the heapam implementation
just calls the relevant indexbatch.c routine, which in practice
handles everything that's needed by heapam on its own. My guess is
that Andres will slightly prefer things that way, since of course it's
possible that some external table AM will need non-trivial code just
to take a mark (restoring a mark has to be a bit more complicated to
account for the read stream, which is why v20 already had one).
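
A minimal sketch of the shape described above, assuming hypothetical
stand-in types and function names (none of these identifiers come from
the patch set). The mark callback is a thin wrapper over a generic
batch routine, while the restore callback has extra work to do because
of the read stream; here that work is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the scan's batch ring buffer state. */
typedef struct SketchBatchRingBuf
{
	int			scan_pos;		/* current position within the ring */
	int			mark_pos;		/* position saved by the last mark */
	bool		mark_valid;		/* has a mark been taken? */
} SketchBatchRingBuf;

/* Hypothetical stand-in for the table AM's per-scan fetch state. */
typedef struct SketchFetch
{
	SketchBatchRingBuf ring;	/* owned by the table AM */
	int			stream_resets;	/* counts (pretend) read stream resets */
} SketchFetch;

/* Generic helpers, standing in for routines that live in indexbatch.c. */
static void
sketch_batch_mark_pos(SketchBatchRingBuf *ring)
{
	ring->mark_pos = ring->scan_pos;
	ring->mark_valid = true;
}

static void
sketch_batch_restore_pos(SketchBatchRingBuf *ring)
{
	assert(ring->mark_valid);
	ring->scan_pos = ring->mark_pos;
}

/* Table AM "take a mark" callback: the generic routine is all it needs. */
void
sketch_heapam_markpos(SketchFetch *fetch)
{
	sketch_batch_mark_pos(&fetch->ring);
}

/*
 * Table AM "restore a mark" callback: prefetched blocks no longer match
 * the restored position, so this toy version records a stream reset
 * before moving the scan position back.
 */
void
sketch_heapam_restrpos(SketchFetch *fetch)
{
	fetch->stream_resets++;
	sketch_batch_restore_pos(&fetch->ring);
}
```

An external table AM with more involved state could put non-trivial
code in its own mark callback instead of delegating.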

We're losing 2 callbacks from the index AM API (for mark/restore), but
gaining 2 very similar ones in the table AM API.

We now expect the table AM's index_fetch_begin callback to set
IndexScanDescData.xs_getnext_slot for the scan itself -- another thing
that happens in the table AM rather than in indexam.c in v21. That way
we don't need 4 extra TableAmRoutine callbacks (one for every
combination of amgettuple, amgetbatch, index-only, plain).

I don't think requiring exactly 4 slot-based variants set within
TableAmRoutine, as was the case with the last several patch versions,
was sensible. In v21, the table AM can provide as few or as many
callbacks as it likes (obviously at least one callback must exist).
This works fine, provided the table AM is sure that the callback will
be correct for the duration of that particular scan, based on what is
known about the scan when it begins. (In practice this means the table
AM exploiting knowledge of whether the scan is index-only or plain, or
whether it's amgettuple or amgetbatch -- but other variations might
make sense in the future.)
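
To illustrate the callback selection described above, here is a
minimal sketch using hypothetical stand-in types. Only the field name
xs_getnext_slot and the index_fetch_begin role come from this message;
the struct layout, the flags, and every sketch_* name are assumptions
made for illustration. This toy table AM provides three callbacks
rather than four, reusing one for both amgettuple cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct SketchScan SketchScan;
typedef bool (*sketch_getnext_slot_fn) (SketchScan *scan);

/* Tags recording which callback variant ran (illustration only). */
enum
{
	SKETCH_NONE,
	SKETCH_BATCH_PLAIN,
	SKETCH_BATCH_INDEX_ONLY,
	SKETCH_TUPLE_ANY
};

/* Hypothetical stand-in for the scan descriptor. */
struct SketchScan
{
	bool		index_only;		/* index-only scan, as opposed to plain */
	bool		uses_batches;	/* amgetbatch, as opposed to amgettuple */
	int			last_variant;	/* set by whichever callback runs */
	sketch_getnext_slot_fn xs_getnext_slot; /* chosen once, at scan begin */
};

static bool
sketch_getnext_batch_plain(SketchScan *scan)
{
	scan->last_variant = SKETCH_BATCH_PLAIN;
	return false;				/* toy example: no tuples returned */
}

static bool
sketch_getnext_batch_index_only(SketchScan *scan)
{
	scan->last_variant = SKETCH_BATCH_INDEX_ONLY;
	return false;
}

static bool
sketch_getnext_tuple_any(SketchScan *scan)
{
	scan->last_variant = SKETCH_TUPLE_ANY;
	return false;
}

/*
 * Stand-in for the table AM's index_fetch_begin callback.  The choice
 * relies only on facts known when the scan begins, so the callback
 * stays correct for the duration of the scan.
 */
void
sketch_index_fetch_begin(SketchScan *scan)
{
	if (scan->uses_batches)
		scan->xs_getnext_slot = scan->index_only ?
			sketch_getnext_batch_index_only :
			sketch_getnext_batch_plain;
	else
		scan->xs_getnext_slot = sketch_getnext_tuple_any;

	/* At least one callback must exist. */
	assert(scan->xs_getnext_slot != NULL);
}
```

The caller never needs to know how many variants the table AM has; it
just invokes scan->xs_getnext_slot(scan) for each tuple.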

--
Peter Geoghegan

Attachment	Content-Type	Size
v21-0012-Hacky-implementation-of-making-read_stream_reset.patch	application/octet-stream	5.1 KB
v21-0015-WIP-aio-bufmgr-Fix-race-condition-leading-to-dea.patch	application/octet-stream	3.1 KB
v21-0013-WIP-read-stream-Split-decision-about-look-ahead-.patch	application/octet-stream	14.8 KB
v21-0001-Rename-heapam_index_fetch_tuple-argument-for-cla.patch	application/octet-stream	2.6 KB
v21-0014-aio-Fix-pgaio_io_wait-for-staged-IOs-B.patch	application/octet-stream	6.3 KB
v21-0011-read_stream-Only-increase-distance-when-waiting-.patch	application/octet-stream	4.2 KB
v21-0009-Make-hash-index-AM-use-amgetbatch-interface.patch	application/octet-stream	47.3 KB
v21-0008-heapam-Add-index-scan-I-O-prefetching.patch	application/octet-stream	46.1 KB
v21-0010-aio-io_uring-Trigger-async-processing-for-large-.patch	application/octet-stream	7.2 KB
v21-0007-heapam-Optimize-pin-transfers-during-index-scans.patch	application/octet-stream	6.7 KB
v21-0006-Add-interfaces-that-enable-index-prefetching.patch	application/octet-stream	239.4 KB
v21-0003-Add-slot-based-table-AM-index-scan-interface.patch	application/octet-stream	78.4 KB
v21-0004-heapam-Track-heap-block-in-IndexFetchHeapData.patch	application/octet-stream	4.7 KB
v21-0005-heapam-Keep-buffer-pins-across-index-rescans.patch	application/octet-stream	3.9 KB
v21-0002-Move-heapam_handler.c-index-scan-code-to-new-fil.patch	application/octet-stream	21.3 KB

In response to

Re: index prefetching at 2026-04-01 22:50:15 from Peter Geoghegan
