Re: Issue with the PRNG used by Postgres

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Parag Paul <parag(dot)paul(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Issue with the PRNG used by Postgres
Date: 2024-04-12 03:15:38
Message-ID: 325444.1712891738@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> ... But Robert's question remains: how does
> PANIC'ing after awhile make anything better? I flat out don't
> believe the idea that having a backend stuck on a spinlock
> would otherwise go undetected.

Oh, wait. After thinking a bit longer I believe I recall the argument
for this behavior: it automates recovery from a genuinely stuck
spinlock. If we waited forever, the only way out of that is for a
DBA to kill -9 the stuck process, which has exactly the same end
result as a PANIC, except that it takes a lot longer to put the system
back in service and perhaps rousts somebody, or several somebodies,
out of their warm beds to fix it. If you don't have a DBA on-call
24x7 then that answer looks even worse.

So there's that. But that's not an argument that we need to be in a
hurry to timeout; if the built-in reaction time is less than perhaps
10 minutes you're still miles ahead of the manual solution.

On the third hand, it's still true that we have no comparable
behavior for any other source of system lockups, and it's difficult
to make a case that stuck spinlocks really need more concern than
other kinds of bugs.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2024-04-12 03:18:06 Re: BitmapHeapScan streaming read user and prelim refactoring
Previous Message David Steele 2024-04-12 03:11:09 Re: post-freeze damage control