Re: Buffer locking is special (hints, checksums, AIO writes)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: Buffer locking is special (hints, checksums, AIO writes)
Date: 2026-03-25 22:35:55
Message-ID: yn4f45z6wl6wstuwha3qjsicrc6uvymkldtqz2ttchz7zdue3y@n6fll5dcter4
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2026-03-25 17:34:33 -0400, Melanie Plageman wrote:
> On Wed, Mar 11, 2026 at 6:40 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > I pushed this and many of the later patches in the series. Here are updated
> > versions of the remaining changes. The last two previously were one commit
> > with "WIP" in the title. The first one has, I think, not had a lot of review -
> > but it's also not a complicated change.
>
> 0001 looks good except for the comment above PageSetChecksum() that
> says it is only for shared buffers and a stray reference to the
> no-longer-present bufToWrite variable in a comment around line 4490 in
> bufmgr.c

Thanks for catching these.

Updated the PageSetChecksum() comment to

* Set checksum on a page.
*
* If the page is in shared buffers, it needs to be locked in at least
* share-exclusive mode.
...

> 0002
> diff --git a/src/backend/access/nbtree/nbtpage.c
> b/src/backend/access/nbtree/nbtpage.c
> index cc9c45dc40c..ad700e590e8 100644
> --- a/src/backend/access/nbtree/nbtpage.c
> +++ b/src/backend/access/nbtree/nbtpage.c
> @@ -1011,24 +1011,48 @@ _bt_relandgetbuf(Relation rel, Buffer obuf,
> BlockNumber blkno, int access)
> Assert(BlockNumberIsValid(blkno));
> if (BufferIsValid(obuf))
> - _bt_unlockbuf(rel, obuf);
> - buf = ReleaseAndReadBuffer(obuf, rel, blkno);
> - _bt_lockbuf(rel, buf, access);
> + {
> + if (BufferGetBlockNumber(obuf) == blkno)
> + {
> + /* trade in old lock mode for new lock */
> + _bt_unlockbuf(rel, obuf);
> + buf = obuf;
> + }
> + else
> + {
> + /* release lock and pin at once, that's a bit more efficient */
> + _bt_relbuf(rel, obuf);
> + buf = ReadBuffer(rel, blkno);
> + }
> + }
> + else
> + buf = ReadBuffer(rel, blkno);
>
> Not related to this patch, but why do we unlock and relock it when
> obuf has the block we need? Couldn't we pass lock mode and then just
> do nothing if it is the right lockmode?

I think it's very unlikely that it's called at any frequency with the same
buffer and lockmode. What would be the point of calling _bt_relandgetbuf() if
that's the case.

> Setting that aside, I presume we don't need to check the fork and
> relfilelocator (as ReleaseAndReadBuffer() did) because this code knows
> it will be the same?

Yea, it's a single index, so there can't be a different relfilenode.

> 0003
> AFAICT, this does what you claim. I don't really know what else to
> look when reviewing it, if I'm being honest. As such, I diligently fed
> it through AI which suggested you may have lost a
> VALGRIND_MAKE_MEM_NOACCESS(BufHdrGetBlock(buf), BLCKSZ);
> which sounds right to me and like something you should fix.

Good catch Melai.

> Also, I'd say this comment
> + /*
> + * Now okay to allow cancel/die interrupts again, were held when the lock
> + * was acquired.
> + */
>
> needs a "which" after the comma to read smoothly.

Fixed.

Running it through valgrind and then will work on reading through one more
time and pushing them.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Matthias van de Meent 2026-03-25 22:42:01 Re: SQL-level pg_datum_image_equal
Previous Message SATYANARAYANA NARLAPURAM 2026-03-25 22:34:13 Re: LockHasWaiters() crashes on fast-path locks