Re: Buffer locking is special (hints, checksums, AIO writes)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Noah Misch <noah(at)leadboat(dot)com>
Subject: Re: Buffer locking is special (hints, checksums, AIO writes)
Date: 2025-08-26 21:00:13
Message-ID: fumdcye25kkobwrjfb6d7m52gvlpzbvomr4onm6nsbcbrv24jg@fqydosoppg4l
Lists: pgsql-hackers

Hi,

On 2025-08-26 16:21:36 -0400, Robert Haas wrote:
> On Fri, Aug 22, 2025 at 3:45 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > My conclusion from the above is that we ought to:
> >
> > A) Make Buffer Locks something separate from lwlocks
> > B) Merge BufferDesc.state and the content lock
> > C) Allow some modifications of BufferDesc.state while holding spinlock
>
> +1 to (A) and (B). No particular opinion on (C) but if it works well, great.

Without it I see performance regressions: with more changes funneled into
one atomic variable, the rate of CAS failures goes up :/
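
As a rough illustration of that CAS-failure concern (a sketch only, not code
from the patch set): every name below, including DemoBufferDesc and the bit
layout, is hypothetical and just assumes a single 64-bit word that holds both
a pin count and flag bits. A compare-and-swap built on a stale snapshot fails
as soon as any other field in the same word has changed, so the more state
shares one atomic, the more retries there are.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout, not PostgreSQL's: pin count in the low bits, one flag above it. */
#define DEMO_REFCOUNT_ONE  UINT64_C(1)
#define DEMO_REFCOUNT_MASK UINT64_C(0x3FFFF)
#define DEMO_FLAG_DIRTY    (UINT64_C(1) << 18)

typedef struct DemoBufferDesc
{
    _Atomic uint64_t state;
} DemoBufferDesc;

/* Another backend pinning the buffer: changes the same word the flag lives in. */
static void
demo_pin(DemoBufferDesc *buf)
{
    atomic_fetch_add(&buf->state, DEMO_REFCOUNT_ONE);
}

/*
 * One CAS attempt to set the dirty flag, based on the caller's snapshot.
 * On failure *expected is refreshed with the current value, so the caller
 * has to recompute and retry.
 */
static bool
demo_try_set_dirty(DemoBufferDesc *buf, uint64_t *expected)
{
    return atomic_compare_exchange_strong(&buf->state, expected,
                                          *expected | DEMO_FLAG_DIRTY);
}
```

If a pin sneaks in between taking the snapshot and the CAS, the first
demo_try_set_dirty() call fails even though the pin count and the dirty flag
are logically unrelated; only the retry with the refreshed snapshot succeeds.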

> > The order of changes I think makes the most sense is the following:
> >
> > 1) Allow some modifications while holding the buffer header spinlock
> > 2) Reduce buffer pin with just an atomic-sub
> > 3) Widen BufferDesc.state to 64 bits
> > 4) Implement buffer locking inside BufferDesc.state
> > 5) Do IO while holding share-exclusive lock and require all buffer
> > modifications to at least hold share exclusive lock
> > 6) Wait for AIO when acquiring an exclusive content lock
>
> No strong objections. I certainly like getting to (5) and (6) and I
> think those are in the right order. I'm not sure about the rest.

> I thought (1) and (2) were the same change after reading your email

They are certainly related. I thought it'd make sense to split them as
outlined above, as (1) is relatively verbose on its own, but far more
mechanical.

> and it surprises me a little bit that (2) is separate from (4).

Without doing (2) first, I see performance/scalability regressions doing
(4). Doing (3) without (2) also hurts...
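
To make the difference behind (2) concrete, here is a minimal sketch assuming
the pin count sits in the low bits of one 64-bit state word. SketchBufferDesc,
the SKETCH_* constants and both functions are hypothetical and not taken from
the patches; a real unpin path also has other duties that are left out here.
The CAS-loop variant can fail and retry whenever anything else in the word
changes, while the fetch-sub variant is a single atomic operation that cannot
fail.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout, not PostgreSQL's: pin count in the low bits, flags above. */
#define SKETCH_REFCOUNT_ONE  UINT64_C(1)
#define SKETCH_REFCOUNT_MASK UINT64_C(0x3FFFF)
#define SKETCH_FLAG_VALID    (UINT64_C(1) << 18)

typedef struct SketchBufferDesc
{
    _Atomic uint64_t state;
} SketchBufferDesc;

/* Unpin with a CAS loop: retries whenever the word changed under us. */
static uint64_t
sketch_unpin_cas(SketchBufferDesc *buf)
{
    uint64_t old = atomic_load(&buf->state);

    /* On failure 'old' is reloaded, so the new value is recomputed each time. */
    while (!atomic_compare_exchange_weak(&buf->state, &old,
                                         old - SKETCH_REFCOUNT_ONE))
        ;
    return old - SKETCH_REFCOUNT_ONE;
}

/* Unpin with a single atomic subtraction: no retry loop, cannot fail. */
static uint64_t
sketch_unpin_sub(SketchBufferDesc *buf)
{
    return atomic_fetch_sub(&buf->state, SKETCH_REFCOUNT_ONE)
        - SKETCH_REFCOUNT_ONE;
}
```

Both functions return the new state word and leave the flag bits untouched;
the subtraction is only safe under the assumption that the caller really holds
a pin, so the count cannot underflow into the flag bits.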

> > DOES ANYBODY HAVE A BETTER NAME THAN SHARE-EXCLUSIVE???!?
>
> AFAIK "share exclusive" or "SX" is standard terminology. While I'm not
> wholly hostile to the idea of coming up with something else, I don't
> think our tendency to invent our own way to do everything is one of
> our better tendencies as a project.

I guess it bothers me that we'd use share-exclusive to mean the buffer can't
be modified, but for real (vs share, which does allow some modifications). But
it's quite plausible that there's no meaningfully better name, in which
case we certainly shouldn't differ from what's somewhat commonly used.

Greetings,

Andres Freund
