| From: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> |
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: GetBufferDescriptor() being called for local buffers from MarkBufferDirtyHint() |
| Date: | 2026-06-10 10:50:18 |
| Message-ID: | CAExHW5uNjWmqmtXPTYJOA3n9EdyhPh1oWuA3ywvvaOUskjQRFw@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Wed, Jun 10, 2026 at 9:10 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Sat, Jun 06, 2026 at 01:37:42PM +0530, Ashutosh Bapat wrote:
> > 82467f627bd478569de04f4a3f1993098e80c812 added MarkBufferDirtyHint()
> > which invokes GetBufferDescriptor() even for local buffers for which
> > id < 0. Since GetBufferDescriptor() declares id as uint32, -1 is
> > converted to a very large int32 value which is way larger than
> > NBuffers. Thus GetBufferDescriptor() may be returning something from
> > the BufferBlocks which probably has enough memory to accommodate that
> > memory access. But it's a bogus BufferDesc nevertheless. We are not
> > seeing any problem with this right now since MarkBufferDirtyHint()
> > uses the BufferDesc only when it's a shared buffer. Right fix is to
> > let that function handle local buffers first and then call
> > GetBufferDescriptor() as in the attached patch.
>
> @@ -5831,8 +5831,6 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
> {
> BufferDesc *bufHdr;
>
> - bufHdr = GetBufferDescriptor(buffer - 1);
> -
> if (!BufferIsValid(buffer))
> elog(ERROR, "bad buffer ID: %d", buffer);
>
> @@ -5842,6 +5840,8 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
> return;
> }
>
> + bufHdr = GetBufferDescriptor(buffer - 1);
>
> Yep, that's clearly wrong. We are lucky that it does not blow up
> today but that's a ticking bomb. Even with that in mind, the result
> leads to a non-defined behavior.
>
> - return &(BufferDescriptors[id]).bufferdesc;
> + BufferDesc *bdesc = &(BufferDescriptors[id]).bufferdesc;
> +
> + Assert(bdesc->buf_id == id);
> +
> + return bdesc;
> [...]
> - BufferDesc *buf = GetBufferDescriptor(i);
> + /* GetBufferDescriptor() expects buf_id to be set in the descriptor. */
> + BufferDesc *buf = &(BufferDescriptors[i]).bufferdesc;
>
> I am not convinced by these two. For the first one, the assertion is
> a bound check in disguise, which could also use NBuffers as of an (id
> < NBuffers).
Just to clarify, it's stronger than just a bound check.
BufferDesc::buf_id comment says
/*
* Buffer's index number (from 0). The field never changes after
* initialization, so does not need locking.
*/
> With the inlining we care about the performance. This
> also forces the second change in ShmemInit(), which makes even less
> sense once we think about the first assertion.
>
The field is writable and can be accidentally written to. Imagine a
bug *(wrong memory calculation) = <wrong buffer id> changing buf_id
without realising the change especially when <wrong buffer id> is
within 0 to NBuffers. There is no place where we check that the
assumption is true. But we use buf_id in a few critical places.
GetBufferDescriptor() is a good place for checking the assumption. I
agree that there's a small risk that it will make accessing the buffer
slower but only in Assert enabled builds which are anyway slower [1].
FWIW, I ran pgbench on my laptop with nproc = 16 with scale = 10. I
didn't find any performance difference
Without assertion
$ for i in 1 2 3 4 5 6; do pgbench -d postgres -T 300 -c 4 -j 4 | grep tps; done
starting vacuum...end.
tps = 1081.714940 (without initial connection time)
starting vacuum...end.
tps = 1026.242729 (without initial connection time)
starting vacuum...end.
tps = 1080.696554 (without initial connection time)
starting vacuum...end.
tps = 1092.211269 (without initial connection time)
starting vacuum...end.
tps = 1058.359337 (without initial connection time)
starting vacuum...end.
tps = 1087.718979 (without initial connection time)
With Assertion
$ for i in 1 2 3 4 5 6 ; do pgbench -d postgres -T 300 -c 4 -j 4 |
grep tps; done
starting vacuum...end.
tps = 1065.345204 (without initial connection time)
starting vacuum...end.
tps = 1088.642703 (without initial connection time)
starting vacuum...end.
tps = 1077.954350 (without initial connection time)
starting vacuum...end.
tps = 1091.662205 (without initial connection time)
starting vacuum...end.
tps = 1029.412347 (without initial connection time)
starting vacuum...end.
tps = 1086.912667 (without initial connection time)
Said that, I am fine, if we don't want to include the Assertion. I
will keep it in my shared buffer resizing patches where I have found
it useful. Maybe we will reconsider it with resizable shared buffers
context.
> Will fix the first issue on HEAD, that's wrong. Thanks for the
> report.
Thanks.
--
Best Wishes,
Ashutosh Bapat
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Ashutosh Bapat | 2026-06-10 11:00:24 | Re: Fix DROP PROPERTY GRAPH "unsupported object class" error |
| Previous Message | Marko Grujic | 2026-06-10 10:48:20 | [PATCH v1] [BUG #19516] Skip whole-row projection shortcut for OLD/NEW returning type |