Re: Buffer locking is special (hints, checksums, AIO writes)

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: Buffer locking is special (hints, checksums, AIO writes)
Date: 2025-10-04 07:05:45
Message-ID: CAEze2WgGe8vjj3jiWqUugWuwLJ9cLryaGrnASjm-yJ=tEALX2A@mail.gmail.com
Lists: pgsql-hackers

On Tue, 23 Sept 2025 at 00:14, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2025-09-15 19:05:37 -0400, Andres Freund wrote:
> > Here are the first few cleaned up patches implementing the above steps, as
> > well as some cleanups. I included a commit from another thread, as it
> > conflicts with these changes, and we really should apply it - and it's
> > arguably required to make the changes viable, as it removes one more use of
> > PinBuffer_Locked().
> >
> > Another change included is to not return the buffer with the spinlock held
> > from StrategyGetBuffer(), and instead pin the buffer in freelist.c. The reason
> > for that is to reduce the most common PinBuffer_Locked() call. By definition
> > PinBuffer_Locked() will become a bit slower due to 0003. But even without 0003,
> > 0002 is faster than master. And the previous approach also just seems
> > pretty unclean. I don't love that it requires the new TrackNewBufferPin(),
> > but I don't really have a better idea.
> >
> > I invite particular attention to the commit message for 0003 as well as the
> > comment changes in buf_internals.h within.
>
> Robert looked at the patches while we were chatting, and I addressed his
> feedback in this new version.

I like these changes, and have some minor comments:

0001 ensures that ReadRecentBuffer increments the usage counter, which
someone who uses an access strategy may want to prevent. I know this
isn't exactly new behaviour, but it's something I noticed anyway. Apart
from that observation, LGTM.

0002 has a FIXME in a comment in GetVictimBuffer. Assuming it's about
the comment itself needing updates, how about:

+ * Ensure, before we pin a victim buffer, that there's a free refcount
+ * entry, and a resource owner slot for the pin.

Again, LGTM.

0003's UnlockBufHdrExt:
This is implemented with CAS, even when we only want to change bits we
know the state of (or could know, if we spent the effort).
Given its inline nature, wouldn't it be better to use atomic_sub
instructions? Or is this to handle cases where the bits we want to
(un)set might be (un)set by a concurrent process?
If the latter, could we specialize this to do a single atomic_sub
whenever we want to change state bits that we know can be only changed
whilst holding the spinlock?
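
To make the question concrete, here is a minimal sketch of the two
approaches using plain C11 atomics. Everything in it is an assumption for
illustration: the DEMO_* bit layout and the demo_unlock_* functions are
hypothetical and are not the patch's UnlockBufHdrExt or PostgreSQL's
pg_atomic API. The subtract variant is only valid when every bit being
cleared is known to be set and cannot be changed by anyone else while the
header lock is held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical state-word layout, for illustration only. */
#define DEMO_LOCKED   (UINT32_C(1) << 22)   /* header spinlock bit */
#define DEMO_IO_FLAG  (UINT32_C(1) << 23)   /* some lock-protected flag */

/*
 * Variant A: CAS loop. Correct whatever the current value of the bits in
 * set_bits / clear_bits is, and tolerates concurrent changes to other bits
 * of the word (e.g. a refcount in the low bits) by retrying.
 */
static inline uint32_t
demo_unlock_cas(_Atomic uint32_t *state, uint32_t set_bits, uint32_t clear_bits)
{
    uint32_t old = atomic_load(state);
    uint32_t desired;

    do
    {
        desired = (old | set_bits) & ~(clear_bits | DEMO_LOCKED);
    } while (!atomic_compare_exchange_weak(state, &old, desired));

    return old;
}

/*
 * Variant B: a single atomic subtract. Subtracting a mask whose bits are
 * all currently set clears exactly those bits without borrowing from
 * neighbouring bits, so concurrent updates to other bits are unaffected.
 * This relies on the caller knowing that clear_bits and DEMO_LOCKED are set.
 */
static inline uint32_t
demo_unlock_sub(_Atomic uint32_t *state, uint32_t clear_bits)
{
    uint32_t delta = clear_bits | DEMO_LOCKED;
    uint32_t old = atomic_fetch_sub(state, delta);

    /* Precondition check: every subtracted bit must have been set. */
    assert((old & delta) == delta);
    return old;
}
```

With a starting value of DEMO_LOCKED | DEMO_IO_FLAG plus a refcount of 3,
both variants leave just the refcount behind; the difference is that
variant B never needs to retry.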

0004: LGTM

0005: LGTM

0006: LGTM

Kind regards,

Matthias van de Meent
