Re: VM corruption on standby

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Melanie Plageman <melanieplageman(at)gmail(dot)com>
Subject: Re: VM corruption on standby
Date: 2025-08-19 13:56:27
Message-ID: s4lymuioguv4ir75jqqkl5taos2ft6aps2enjlwnphgb5loihq@llhtt5gkbjpb
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2025-08-19 02:13:43 -0400, Tom Lane wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > On Tue, Aug 19, 2025 at 4:52 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> But I'm of the opinion that proc_exit
> >> is the wrong thing to use after seeing postmaster death, critical
> >> section or no. We should assume that system integrity is already
> >> compromised, and get out as fast as we can with as few side-effects
> >> as possible. It'll be up to the next generation of postmaster to
> >> try to clean up.
>
> > Then wouldn't backends blocked in LWLockAcquire(x) hang forever, after
> > someone who holds x calls _exit()?
>
> If someone who holds x is killed by (say) the OOM killer, how do
> we get out of that?

On linux - the primary OS with OOM killer troubles - I'm pretty sure'll lwlock
waiters would get killed due to the postmaster death signal we've configured
(c.f. PostmasterDeathSignalInit()).

A long while back I had experimented with replacing waiting on semaphores
(within lwlocks) with a latch wait. IIRC it was a bit slower under heavy
contention, but that vanished when adding some adaptive spinning to lwlocks -
which is also what we need to make it more feasible to replace some of the
remaining spinlocks...

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2025-08-19 14:08:21 UPDATE with invalid domain constraint
Previous Message Florents Tselai 2025-08-19 13:52:24 Re: mention unused_oids script in pg_proc.dat