Re: Buffer locking is special (hints, checksums, AIO writes)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: Buffer locking is special (hints, checksums, AIO writes)
Date: 2026-03-11 22:40:41
Message-ID: mheeefrtikvgjnjsenocvo3afj7vlpr5rljkzevssxs77n2zdt@sugqtjhryydo
Lists: pgsql-hackers

Hi,

On 2026-02-15 11:52:39 -0800, Noah Misch wrote:
> On Sat, Feb 07, 2026 at 12:44:25PM +0200, Heikki Linnakangas wrote:
> > On 03/02/2026 00:33, Andres Freund wrote:
> > > - The way MarkBufferDirtyHint() operates was copied into
> > > heap_inplace_update_and_unlock(). Now that MarkBufferDirtyHint() won't work
> > > that way anymore, it seems better to go with the alternative approach the
> > > comments already outlined, namely to only delay updating of the buffer
> > > contents.
> > >
> > > I've done this in a prerequisite commit, as it doesn't actually depend on any
> > > of the other changes. Noah, any chance you could take a look at this?
>
> v12-0001-heapam-Don-t-mimic-MarkBufferDirtyHint-in-inplac.patch looks good.

> > How about this:
> >
> > * We avoid that by using a temporary copy of the buffer to hide our
> > * change from other backends until it's been WAL-logged. We apply our
> > * change to the temporary copy and WAL-log it before modifying the real
> > * page. That way any action a reader of the in-place-updated value takes
> > * will be WAL logged after this change.
>
> Either v12 or v12 w/ this edit is fine with me. I find this proposed text
> redundant with nearby comment "register block matching what buffer will look
> like after changes", so I mildly prefer v12.

Thanks for the review!

I pushed this and many of the later patches in the series. Here are updated
versions of the remaining changes. The last two previously were one commit
with "WIP" in the title. The first one has, I think, not had a lot of review -
but it's also not a complicated change.

I see decent performance improvements from 0002+0003 with a pipelined
pgbench -S that is fully resident in shared_buffers: ~7-8% on an older, small
two-socket machine.

The improvement is just from reducing the number of atomic operations on
contended cachelines (i.e. inner btree pages).
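
To illustrate why fewer atomic operations help, here is a minimal standalone
sketch. It is not the code from the attached patches. It assumes a toy buffer
header in which the pin count and an exclusive-lock flag share one atomic word;
ToyBuffer, the TOY_* constants and the toy_* functions are hypothetical names
made up for this example. Unlocking and unpinning separately touches the
(possibly contended) cacheline twice, while a combined release needs one
read-modify-write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: low 16 bits = pin count, bit 16 = exclusive lock. */
#define TOY_PIN_ONE   ((uint32_t) 1)
#define TOY_PIN_MASK  ((uint32_t) 0xFFFF)
#define TOY_LOCK_EXCL (((uint32_t) 1) << 16)

typedef struct ToyBuffer
{
    _Atomic uint32_t state;   /* pin count and lock flag in one word */
    unsigned    atomic_ops;   /* demo instrumentation only, not thread-safe */
} ToyBuffer;

/* Start in the state "pinned once and exclusively locked by us". */
static void
toy_init_pinned_locked(ToyBuffer *buf)
{
    atomic_init(&buf->state, TOY_LOCK_EXCL | TOY_PIN_ONE);
    buf->atomic_ops = 0;
}

/* Two separate steps: two atomic read-modify-write operations. */
static void
toy_unlock_then_release(ToyBuffer *buf)
{
    uint32_t    old;

    old = atomic_fetch_and(&buf->state, ~TOY_LOCK_EXCL);    /* unlock */
    buf->atomic_ops++;
    assert(old & TOY_LOCK_EXCL);

    old = atomic_fetch_sub(&buf->state, TOY_PIN_ONE);       /* unpin */
    buf->atomic_ops++;
    assert((old & TOY_PIN_MASK) > 0);
}

/*
 * Combined step: we know we hold the lock and at least one pin, so
 * subtracting both at once cannot borrow into unrelated bits.
 */
static void
toy_unlock_release(ToyBuffer *buf)
{
    uint32_t    old;

    old = atomic_fetch_sub(&buf->state, TOY_LOCK_EXCL + TOY_PIN_ONE);
    buf->atomic_ops++;
    assert((old & TOY_LOCK_EXCL) && (old & TOY_PIN_MASK) > 0);
}
```

The sketch leaves out waking up lock waiters and everything else a real buffer
manager has to do at this point; it only shows the difference in the number of
atomic operations on the shared word.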

Without pipelining the difference is smaller (1-2%), because the context
switches are the bigger bottleneck.

More extreme workloads involving an index nested loop join benefit
more. E.g. the setup and query from
https://anarazel.de/talks/2024-05-29-pgconf-dev-c2c/postgres-perf-c2c.pdf
slide 23, show a 25% improvement on the same 2 socket machine.

We could probably do something similar for the also very common combination of
PinBuffer() + LockBuffer(), but I think it'd be a fair bit more complicated,
and would require new APIs, rather than just using existing APIs more widely.

Greetings,

Andres Freund

Attachment                                                       Content-Type  Size
v13-0001-bufmgr-Don-t-copy-pages-while-writing-out.patch         text/x-diff   10.9 KB
v13-0002-Use-UnlockReleaseBuffer-in-more-places.patch            text/x-diff   7.8 KB
v13-0003-bufmgr-Make-UnlockReleaseBuffer-more-efficient.patch    text/x-diff   2.6 KB
