| From: | Andres Freund <andres(at)anarazel(dot)de> |
|---|---|
| To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
| Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, Alexandre Felipe <o(dot)alexandre(dot)felipe(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
| Subject: | Re: index prefetching |
| Date: | 2026-03-10 22:47:52 |
| Message-ID: | y5wp4uxudeajyljuzdm4cmqvwmzlujwzkxbadimoa64cmybgjp@5dd7le2jxc5m |
| Lists: | pgsql-hackers |
Hi,

On 2026-03-10 16:57:35 -0400, Peter Geoghegan wrote:
> On Fri, Feb 27, 2026 at 6:52 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > This is a huge change. Is there a chance we can break it up into more
> > manageable chunks?
>
> Attached is v12, which has revisions that address most of your
> feedback items. It also includes items that address problems that I
> noticed during performance validation work.
>
> Highlights:
>
> * Substantial revisions that give table AMs and index AMs direct
> control over batch layout -- without giving up on batch
> recycling/caching. This is essentially what you (Andres) requested
> because the design from v11 was not sufficiently AM agnostic. In
> particular:
>
> - Table AMs now control the size and layout of visibility information
> (in practice heapam uses this to store per-item visibility state from
> the visibility map).
>
> - Index AMs have their own opaque state for things like sibling link
> block numbers, avoiding the assumption that other index AMs supporting
> amgetbatch will need to work like nbtree and hash as regards how they
> navigate to the next index page/index keyspace associated with each
> batch.

Nice!
> * No more read stream yielding. Numerous new patches from Andres are
> now included, which helps with this. In particular, "WIP: read_stream:
> Only increase distance when waiting for IO" fixes the problematic
> regression in an adversarial query -- the one that prompted me to
> invent yielding in the first place. As a result of all this, the read
> stream callback added by the prefetching commit itself is now
> substantially simpler than it was in v11.

Yay.
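
For readers following along, the rule named in that patch title can be pictured roughly as below. This is only a sketch under stated assumptions, not the code in read_stream.c: the struct, the function name, and the doubling policy are all hypothetical, chosen to show a look-ahead distance that grows only when the consumer actually had to wait for an IO.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical look-ahead state; not the real ReadStream struct. */
typedef struct SketchLookahead
{
    int distance;       /* how many blocks ahead we currently prefetch */
    int max_distance;   /* upper bound, e.g. derived from pin limits */
} SketchLookahead;

/*
 * Called after a block has been handed to the consumer. The distance
 * grows only if the consumer had to wait for the IO to complete; reads
 * that were already finished leave it unchanged. Doubling is an
 * assumption made for this sketch.
 */
static void
sketch_note_read(SketchLookahead *s, bool had_to_wait)
{
    assert(s->distance >= 1 && s->max_distance >= s->distance);

    if (!had_to_wait)
        return;                 /* IO was already done: deep enough */

    s->distance *= 2;
    if (s->distance > s->max_distance)
        s->distance = s->max_distance;
}
```

Under this rule a workload whose reads always complete in time never ramps up the distance, which is the property that matters for the adversarial query mentioned above.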
> * There are now a couple of extra patches created by breaking things
> into more distinct commits. Namely, there's a new "heapam: Track heap
> block in IndexFetchHeapData using xs_blk" commit, as well as a new
> "Make IndexScanInstrumentation a pointer in executor scan nodes"
> commit.

Yay^2.
> * Moreover, some commits now appear in a slightly different order,
> prioritizing work closer to being committable; those commits now come
> first.

Yay^3.
> * New commit "Use simple hash for PrivateRefCount" addresses some of
> the problems we were seeing with PrivateRefCount performance. This
> generic optimization addresses an existing problem that would
> otherwise be much worse with the index prefetching work in place.

Let's get that in soon.

Alexandre Felipe posted an implementation of this in
https://postgr.es/m/CAE8JnxNTETEUiAOF31%3D_yo%3DpvyAi9npOeJfcTvEJJbi4vomtYA%40mail.gmail.com
I don't agree with many of the other changes, but the simplehash conversion
contains an interesting piece - the ability to avoid the status field. I'd
encourage Alexandre to upstream that separately from this thread (and also
separately from the rest of the patches in the above thread).
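
To make "avoid the status field" concrete: simplehash.h normally keeps a per-entry status marking a slot empty or in use. When the key type has a value that can never be a real key, that value can mark empty slots instead. Below is a minimal sketch of the idea, assuming key 0 plays the role of InvalidBuffer. All names are hypothetical, it is not Alexandre's patch or the simplehash API, and it omits deletion and resizing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_INVALID_KEY 0    /* stands in for InvalidBuffer */
#define SKETCH_NSLOTS      64   /* fixed power of two, for brevity */

typedef struct SketchEntry
{
    int32_t key;        /* SKETCH_INVALID_KEY means "slot is empty" */
    int32_t refcount;   /* payload; note there is no status member */
} SketchEntry;

typedef struct SketchTable
{
    SketchEntry slots[SKETCH_NSLOTS];
    int         nused;
} SketchTable;

static void
sketch_init(SketchTable *t)
{
    /* Zeroing sets every key to SKETCH_INVALID_KEY, i.e. all slots empty. */
    memset(t, 0, sizeof(*t));
}

static uint32_t
sketch_hash(int32_t key)
{
    uint32_t h = (uint32_t) key;

    /* Simple integer bit mixer; any reasonable hash would do here. */
    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}

/*
 * Linear-probing lookup. With create != 0 a missing key is inserted
 * with refcount 0. Returns NULL if the key is absent (create == 0) or
 * the table is full.
 */
static SketchEntry *
sketch_lookup(SketchTable *t, int32_t key, int create)
{
    uint32_t pos;
    int      i;

    assert(key != SKETCH_INVALID_KEY);
    pos = sketch_hash(key) & (SKETCH_NSLOTS - 1);

    for (i = 0; i < SKETCH_NSLOTS; i++)
    {
        SketchEntry *e = &t->slots[pos];

        if (e->key == key)
            return e;
        if (e->key == SKETCH_INVALID_KEY)
        {
            if (!create)
                return NULL;
            e->key = key;
            e->refcount = 0;
            t->nused++;
            return e;
        }
        pos = (pos + 1) & (SKETCH_NSLOTS - 1);
    }
    return NULL;
}
```

The entry stays at two 32-bit members, so emptiness costs no extra space and is tested with the same comparison that probes for the key.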
> However, I have NOT yet acted on a few feedback items from Andres:
>
> * I still don't know what Andres meant about requiring table AMs to
> free batch index page buffer pins representing a modularity violation.
> I don't see how we can reasonably avoid it while still preserving the
> guarantees needed to safely drop buffer pins eagerly during index-only
> scans that require prefetching.
>
> * I'm also not at all sure what Andres meant about index AMs like hash
> not holding onto their own buffer pins, given that prefetching uses a
> read stream sensitive to the number of buffer pins the backend holds.

I tried to respond in
https://postgr.es/m/vbb4naf2tvm2tm7yoml54pzvrmn77p4nvq4awfa4wufc3hn7qx%40mof5q6li3xzv
to explain my concerns / what I think needs to happen.

Greetings,

Andres Freund