Re: index prefetching

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: index prefetching
Date: 2026-01-13 20:36:28
Message-ID: CAH2-Wz=6a7fGz2rALDX+xiFDuEaGQWpZ49xEaBUDKiPH8gcL+Q@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 7, 2026 at 1:50 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> v6 focusses on simplifying the batch management code in
> heapam_batch_getnext_tid. Importantly, heapam_batch_getnext_tid no
> longer uses a loop to process items from the currently loaded batch/to
> load the next batch. The control flow in heapam_batch_getnext_tid is a
> lot simpler in general compared to v5.

The patch stopped applying again. Attached is v7.

Changes:

* Much improved read stream callback code, with comments that explain
exactly what's going on at each step.

* We once again support prefetching with index-only scans (for any
heap fetches that might be required).

We must cache visibility information at the level of whole batches for
this. Otherwise, the read stream could have a different idea than other
code about which heap pages are considered all-visible. The
corresponding code that actually reads heap tuples expects to be able
to get buffers from the read stream that precisely match what it
believes are required for any heap fetches. If they don't both agree
about visibility info, chaos ensues.

We avoid per-tuple visibility lookups in v7, preferring to do
everything (every VM lookup for every TID) up front for each batch.
This is simpler, and I suspect it's somewhat faster on average.

* Fixed several bugs involving scrollable cursors + index prefetching.
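
To illustrate the batch-level visibility caching described above, here is
a minimal standalone sketch. It is not code from the v7 patches: every
name in it (ToyBatch, ToyVisibilityMap, toy_batch_cache_visibility, and
so on) is hypothetical, and the visibility map is modeled as a plain
array of flags. The point is only that the VM is consulted once per item
when the batch is set up, and that both the prefetch side and the
heap-fetch side later read the same cached answers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_MAX_ITEMS 8

/* Hypothetical stand-in for the visibility map: one flag per heap block. */
typedef struct ToyVisibilityMap
{
    const bool *all_visible;    /* all_visible[blkno] */
    uint32_t    nblocks;
} ToyVisibilityMap;

/* Hypothetical batch of TIDs; only the heap block numbers matter here. */
typedef struct ToyBatch
{
    int         nitems;
    uint32_t    heap_block[BATCH_MAX_ITEMS];
    bool        all_visible[BATCH_MAX_ITEMS];   /* cached once per batch */
} ToyBatch;

/* Do every VM lookup for every item up front, when the batch is loaded. */
static void
toy_batch_cache_visibility(ToyBatch *batch, const ToyVisibilityMap *vm)
{
    assert(batch->nitems <= BATCH_MAX_ITEMS);
    for (int i = 0; i < batch->nitems; i++)
    {
        assert(batch->heap_block[i] < vm->nblocks);
        batch->all_visible[i] = vm->all_visible[batch->heap_block[i]];
    }
}

/*
 * Prefetch side: collect the heap blocks that will need a heap fetch.
 * Reads only the cached flags, never the VM itself.
 */
static int
toy_blocks_to_prefetch(const ToyBatch *batch, uint32_t *out)
{
    int         n = 0;

    for (int i = 0; i < batch->nitems; i++)
    {
        if (!batch->all_visible[i])
            out[n++] = batch->heap_block[i];
    }
    return n;
}

/* Consumer side: the same cached flag decides whether to visit the heap. */
static bool
toy_item_needs_heap_fetch(const ToyBatch *batch, int i)
{
    return !batch->all_visible[i];
}
```

Because both sides consult batch->all_visible[] rather than the VM, a VM
bit that changes after the batch was loaded cannot make them disagree
about how many heap buffers are needed, or for which blocks.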

> I still haven't had time to produce an implementation of the "heap
> buffer locking minimization" optimization that's clean enough to
> present to the list.

Still haven't done this. Our new thinking on this is that it'd be best
to get prefetching in better shape before proceeding with the heap
buffer locking optimization.

--
Peter Geoghegan

Attachment                                                       Content-Type              Size
v7-0004-Make-hash-index-AM-use-amgetbatch-interface.patch        application/octet-stream  37.1 KB
v7-0002-Add-prefetching-to-index-scans-using-batch-interf.patch  application/octet-stream  27.5 KB
v7-0003-bufmgr-aio-Prototype-for-not-waiting-for-already-.patch  application/octet-stream  6.9 KB
v7-0001-Add-batching-interfaces-used-by-heapam-and-nbtree.patch  application/octet-stream  201.6 KB
