Re: GetBufferDescriptor() being called for local buffers from MarkBufferDirtyHint()

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: GetBufferDescriptor() being called for local buffers from MarkBufferDirtyHint()
Date: 2026-06-10 10:50:18
Message-ID: CAExHW5uNjWmqmtXPTYJOA3n9EdyhPh1oWuA3ywvvaOUskjQRFw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 10, 2026 at 9:10 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Sat, Jun 06, 2026 at 01:37:42PM +0530, Ashutosh Bapat wrote:
> > 82467f627bd478569de04f4a3f1993098e80c812 added MarkBufferDirtyHint()
> > which invokes GetBufferDescriptor() even for local buffers for which
> > id < 0. Since GetBufferDescriptor() declares id as uint32, -1 is
> > converted to a very large int32 value which is way larger than
> > NBuffers. Thus GetBufferDescriptor() may be returning something from
> > the BufferBlocks which probably has enough memory to accommodate that
> > memory access. But it's a bogus BufferDesc nevertheless. We are not
> > seeing any problem with this right now since MarkBufferDirtyHint()
> > uses the BufferDesc only when it's a shared buffer. Right fix is to
> > let that function handle local buffers first and then call
> > GetBufferDescriptor() as in the attached patch.
>
> @@ -5831,8 +5831,6 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
> {
> BufferDesc *bufHdr;
>
> - bufHdr = GetBufferDescriptor(buffer - 1);
> -
> if (!BufferIsValid(buffer))
> elog(ERROR, "bad buffer ID: %d", buffer);
>
> @@ -5842,6 +5840,8 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
> return;
> }
>
> + bufHdr = GetBufferDescriptor(buffer - 1);
>
> Yep, that's clearly wrong. We are lucky that it does not blow up
> today but that's a ticking bomb. Even with that in mind, the result
> leads to a non-defined behavior.
>
> - return &(BufferDescriptors[id]).bufferdesc;
> + BufferDesc *bdesc = &(BufferDescriptors[id]).bufferdesc;
> +
> + Assert(bdesc->buf_id == id);
> +
> + return bdesc;
> [...]
> - BufferDesc *buf = GetBufferDescriptor(i);
> + /* GetBufferDescriptor() expects buf_id to be set in the descriptor. */
> + BufferDesc *buf = &(BufferDescriptors[i]).bufferdesc;
>
> I am not convinced by these two. For the first one, the assertion is
> a bound check in disguise, which could also use NBuffers as of an (id
> < NBuffers).

Just to clarify, it's stronger than just a bound check.
BufferDesc::buf_id comment says
/*
* Buffer's index number (from 0). The field never changes after
* initialization, so does not need locking.
*/
> With the inlining we care about the performance. This
> also forces the second change in ShmemInit(), which makes even less
> sense once we think about the first assertion.
>

The field is writable and can be accidentally written to. Imagine a
bug *(wrong memory calculation) = <wrong buffer id> changing buf_id
without realising the change especially when <wrong buffer id> is
within 0 to NBuffers. There is no place where we check that the
assumption is true. But we use buf_id in a few critical places.
GetBufferDescriptor() is a good place for checking the assumption. I
agree that there's a small risk that it will make accessing the buffer
slower but only in Assert enabled builds which are anyway slower [1].

FWIW, I ran pgbench on my laptop with nproc = 16 with scale = 10. I
didn't find any performance difference

Without assertion
$ for i in 1 2 3 4 5 6; do pgbench -d postgres -T 300 -c 4 -j 4 | grep tps; done
starting vacuum...end.
tps = 1081.714940 (without initial connection time)
starting vacuum...end.
tps = 1026.242729 (without initial connection time)
starting vacuum...end.
tps = 1080.696554 (without initial connection time)
starting vacuum...end.
tps = 1092.211269 (without initial connection time)
starting vacuum...end.
tps = 1058.359337 (without initial connection time)
starting vacuum...end.
tps = 1087.718979 (without initial connection time)

With Assertion
$ for i in 1 2 3 4 5 6 ; do pgbench -d postgres -T 300 -c 4 -j 4 |
grep tps; done
starting vacuum...end.
tps = 1065.345204 (without initial connection time)
starting vacuum...end.
tps = 1088.642703 (without initial connection time)
starting vacuum...end.
tps = 1077.954350 (without initial connection time)
starting vacuum...end.
tps = 1091.662205 (without initial connection time)
starting vacuum...end.
tps = 1029.412347 (without initial connection time)
starting vacuum...end.
tps = 1086.912667 (without initial connection time)

Said that, I am fine, if we don't want to include the Assertion. I
will keep it in my shared buffer resizing patches where I have found
it useful. Maybe we will reconsider it with resizable shared buffers
context.

> Will fix the first issue on HEAD, that's wrong. Thanks for the
> report.

Thanks.

[1] https://www.postgresql.org/message-id/CAExHW5tNMNQ5ennUM3QK%2BMQGw703M9T6z-LvBKKAcgpJ3KW14Q%40mail.gmail.com

--
Best Wishes,
Ashutosh Bapat

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2026-06-10 11:00:24 Re: Fix DROP PROPERTY GRAPH "unsupported object class" error
Previous Message Marko Grujic 2026-06-10 10:48:20 [PATCH v1] [BUG #19516] Skip whole-row projection shortcut for OLD/NEW returning type