| From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, Alexandre Felipe <o(dot)alexandre(dot)felipe(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
| Subject: | Re: index prefetching |
| Date: | 2026-04-03 05:17:14 |
| Message-ID: | CAH2-WzkiCK=wELiXPgriN4r7cJzGb3Xg48E9YHrFEyEPTkynOw@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Wed, Apr 1, 2026 at 6:50 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> Attached is v20, which does things this way. Now it's completely clear
> that the scan's batch ring buffer is fully controlled by the table AM.
Attached is v21, which advances things in the direction established in
v20. It also fixes bitrot -- we no longer carry certain read stream
related patches from Andres that were committed since I posted v20 a
couple of days ago.
> There are now only 2 remaining indexbatch.c functions that get called
> from indexam.c. Since these are both trivial, highly generic
> functions, restructuring things so that heapam would call these two
> seemed unnecessary.
I changed my mind about this: I now think it's better if indexam.c
never directly calls indexbatch.c. Just for flexibility and
extensibility. That's how it's done in v21.
In v21, the code in indexbatch.c is about 80% table AM code, 20% index
AM code -- nothing more. That seems a bit cleaner to me, because the
table AM knowing that the batchringbuf must have already been
initialized on its behalf is no longer requires (nothing like that is
required). The table AM fully owns batchringbuf, and initializes it
for itself where needed.
This means I've had to add a new table AM callback to take a mark,
which isn't strictly necessary for heapam -- the heapam implementation
just calls the relevant indexbatch.c routine, which in practice
handles everything that's needed by heapam on its own. My guess is
that Andres will slightly prefer things that way, since of course it's
possible that some external table AM will need non-trivial code just
to take a mark (restoring a mark has to be a bit more complicated to
account for the read stream, which is why v20 already had one).
We're losing 2 callbacks from the index AM API (for mark/restore), but
gaining 2 very similar ones in the table AM API.
We now expect the table AM's index_fetch_begin callback to set
IndexScanDescData.xs_getnext_slot for the scan itself -- another thing
that happens in the table AM rather than in indexam.c in v21. That way
we don't need 4 extra TableAmRoutine callbacks (one for every
combination of amgettuple, amgetbatch, index-only, plain).
I don't think requiring exactly 4 slot-based variants set within
TableAmRoutine, as was the case with the last several patch versions,
was sensible. In v21, the table AM can provide as few or as many
callbacks as it likes (obviously at least one callback must exist).
This works fine, provided the table AM is sure that the callback will
be correct for the duration of that particular scan, based on what is
known about the scan when it begins. (In practice this means the table
AM exploiting knowledge of whether the scan is index-only or plain, or
whether it's amgettuple or amgetbatch -- but other variations might
make sense in the future.)
--
Peter Geoghegan
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Ashutosh Bapat | 2026-04-03 05:37:03 | Re: SQL/PGQ: All properties reference |
| Previous Message | Pavel Stehule | 2026-04-03 05:15:41 | Re: proposal: schema variables |