Re: Issue with the PRNG used by Postgres

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Parag Paul <parag(dot)paul(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Issue with the PRNG used by Postgres
Date: 2024-04-12 16:00:00
Message-ID: 284660e9-e930-963d-71f6-ca693f04dac0@gmail.com
Lists: pgsql-hackers

12.04.2024 08:05, Alexander Lakhin wrote:
> 2024-04-12 05:00:17.981 UTC [762336] PANIC:  stuck spinlock detected at WaitBufHdrUnlocked, bufmgr.c:5726
>

It looks like that spinlock issue is caused by a race condition/deadlock.
What I see when the test fails is:
A client backend executing "DROP DATABASE conflict_db" performs
dropdb() -> DropDatabaseBuffers() -> InvalidateBuffer()
At the same time, bgwriter performs (for the same buffer):
BgBufferSync() -> SyncOneBuffer()

When InvalidateBuffer() is called, the buffer refcount is zero;
then bgwriter pins the buffer, which increases the refcount, so
InvalidateBuffer() gets into the retry loop.
bgwriter then calls UnpinBuffer() -> UnpinBufferNoOwner() ->
  WaitBufHdrUnlocked(), which waits for the BM_LOCKED flag to clear,
while InvalidateBuffer() waits for the buffer refcount to decrease.
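
The mutual wait described above can be replayed in a small single-threaded
toy model. This is only a sketch, not the bufmgr.c code: ToyBuf, the toy_*
functions, and the backend_holds_lock parameter are hypothetical names, and
the model assumes that the invalidating backend is the one keeping BM_LOCKED
set while it waits, which the report implies but does not state outright.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a buffer header; NOT PostgreSQL's BufferDesc. */
typedef struct ToyBuf
{
    unsigned    refcount;       /* number of pins on the buffer */
    bool        hdr_locked;     /* stand-in for the BM_LOCKED flag */
} ToyBuf;

/*
 * bgwriter side: the unpin can proceed only once the header is unlocked.
 * The real code would spin in WaitBufHdrUnlocked() instead of returning.
 */
static bool
toy_try_unpin(ToyBuf *buf)
{
    if (buf->hdr_locked)
        return false;
    assert(buf->refcount > 0);
    buf->refcount--;
    return true;
}

/* Backend side: invalidation can proceed only once nobody pins the buffer. */
static bool
toy_try_invalidate(ToyBuf *buf)
{
    if (buf->refcount != 0)
        return false;
    buf->hdr_locked = false;
    return true;
}

/*
 * Replay the reported interleaving. Returns true if neither side makes
 * progress within max_rounds attempts, i.e. each waits on the other.
 * backend_holds_lock encodes the assumption that the backend keeps the
 * header locked while waiting for the refcount to drop.
 */
static bool
toy_replay_is_stuck(bool backend_holds_lock, int max_rounds)
{
    ToyBuf      buf = {0, false};

    /* InvalidateBuffer() is called: refcount is zero at this point. */
    buf.refcount++;                         /* bgwriter pins the buffer */
    buf.hdr_locked = backend_holds_lock;    /* backend re-checks under lock */

    for (int i = 0; i < max_rounds; i++)
    {
        if (toy_try_unpin(&buf))
            return false;       /* bgwriter released its pin */
        if (toy_try_invalidate(&buf))
            return false;       /* backend finished invalidation */
    }
    return true;                /* both sides are still waiting */
}
```

In this model, toy_replay_is_stuck(true, 1000) returns true (the cycle never
resolves), while toy_replay_is_stuck(false, 1000) returns false, because the
unpin goes through as soon as the header is not locked.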

As it turns out, it's not related to spinlock specifics or the PRNG; it's
just a serendipitous find.

Best regards,
Alexander
