Re: Buffer locking is special (hints, checksums, AIO writes)

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: Buffer locking is special (hints, checksums, AIO writes)
Date: 2025-10-07 16:40:48
Message-ID: CAEze2WjeK9CY003S4dmCugv_H4tz9AaXgnqW+wTc=BaPDg+2xg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Tue, 7 Oct 2025 at 00:55, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2025-10-04 09:05:45 +0200, Matthias van de Meent wrote:
> > 0001 ensures that ReadRecentBuffer increments the usage counter, which
> > someone who uses an access strategy may want to prevent. I know this
> > isn't exactly new behaviour, but something I noticed anyway. Apart
> > from that observation, LGTM
>
> Are you proposing to change behaviour?

Eventually, yes, but not necessarily now or with this patchset.

> Right now ReadRecentBuffer doesn't even
> accept a strategy, so I don't really see this as something that needs to be
> tackled at this point.
>
> I'm not sure I see any real use cases for ReadRecentBuffer() that would
> benefit from a strategy, but I very well might just not be thinking wide
> enough.

I think it's rather strange that there is no Extended variant of
ReadRecentBuffer, like how there is a ReadBufferExtended for
ReadBuffer. Yes, ReadRecentBuffer has more arguments to fill and so
has a smaller difference versus ReadBufferExtended, but it's no
complete replacement for ReadBuffer[Ext] when you're aware of a recent
buffer of the page.

I'm not saying it will definitely happen, but I could see that e.g.
amcheck might want to keep track of buffer IDs of recent heap pages it
accessed to verify index's results, without holding a pin on all the
pages; and instead using ReadRecentBuffer[Extended] with a
BufferAccessStrategy to allow re-acquiring the buffer pin without
blowing out shared buffers or making parts of the pool take forever to
evict again.

> > 0003's UnlockBufHdrExt:
> > This is implemented with CAS, even when we only want to change bits we
> > know the state of (or could know, if we spent the effort).
> > Given its inline nature, wouldn't it be better to use atomic_sub
> > instructions? Or is this to handle cases where the bits we want to
> > (un)set might be (un)set by a concurrent process?
>
> Yes, it's to handle concurrent changes to the buffer state.
>
> > If the latter, could we specialize this to do a single atomic_sub
> > whenever we want to change state bits that we know can be only changed
> > whilst holding the spinlock?
>
> We probably could optimize some cases as an atomic-sub, some others as an
> atomic-and and others again as an atomic-or. The latter to however are
> implemented as a CAS on x86 anyway...
>
> After 0004 I don't think any of the paths using this are actually particularly
> hot, so I'm somewhat doubtful it's worth to try to optimize this too much. If
> there are hot paths, we really should try to work towards not even needing the
> buffer header spinlock, that has a bigger impact that improving the code for
> unlocking the buffer header...

Fair enough; I guess we'll see if further optimization would have much
impact once this all has been committed.

Kind regards,

Matthias van de Meent
Databricks

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-10-07 16:51:20 Re: Optimize LISTEN/NOTIFY
Previous Message Nathan Bossart 2025-10-07 16:26:03 Re: Add mode column to pg_stat_progress_vacuum