Re: Redesigning postmaster death handling

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Redesigning postmaster death handling
Date: 2025-08-21 05:28:04
Message-ID: 1375963.1755754084@sss.pgh.pa.us
Lists: pgsql-hackers

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> Here's an experimental patch to fix our shutdown strategy on
> postmaster death, as discussed in a nearby report[1].

Thanks for tackling this topic.

> For systems lacking that facility, the idea I'm trying out here is
> that backends that detect the condition in WaitEventSetWait() should
> themselves blast all backends with SIGQUIT, in a sense taking over the
> role of the departed postmaster.

Hmm. Up to now, we have not had an assumption that postmaster
children are aware of every other postmaster child. In particular,
not all postmaster children have PGPROC entries. How much does
this matter? What happens if the shared PGPROC array is corrupt?
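
To make the corruption question concrete, here is a minimal sketch of the kind of guard a child would want before signalling PIDs read from shared memory. It is not taken from the patch and is not PostgreSQL code: the helper names and the plain pid_t array standing in for the PGPROC array are assumptions for illustration. It relies only on the POSIX rule that kill() treats a pid of 0, -1, or any other negative value as a process group or "every process I may signal", so a damaged entry could reach far beyond the intended targets.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): decide whether a PID read from
 * shared memory is safe to hand to kill().
 */
static int
pid_is_safe_to_signal(pid_t pid, pid_t self)
{
    /*
     * kill(0, sig) signals the caller's process group, kill(-1, sig) signals
     * every process the caller may signal, and other negative values name a
     * process group.  PID 1 is init.  Reject all of these.
     */
    if (pid <= 1)
        return 0;
    /* Do not signal ourselves from inside the loop. */
    if (pid == self)
        return 0;
    return 1;
}

/*
 * Hypothetical helper: send sig to every plausible entry in pids[].
 * Returns the number of kill() calls that succeeded.
 */
static int
signal_all_children(const pid_t *pids, size_t npids, int sig)
{
    pid_t   self = getpid();
    int     sent = 0;

    for (size_t i = 0; i < npids; i++)
    {
        if (!pid_is_safe_to_signal(pids[i], self))
            continue;           /* skip implausible or dangerous entries */
        if (kill(pids[i], sig) == 0)
            sent++;
    }
    return sent;
}
```

A range check like this only filters out the entries that would change the meaning of kill(). It cannot tell a stale but positive PID from a live child, and it does nothing for postmaster children that have no entry in the array at all.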

> I didn't really want any
> consensus/negotiation over who's going to do that, so... they all do.

Agreed on that point.

> Most of the patch is just removing hundreds of lines of errors and
> conditions and comments that were now unreachable.

The patch would likely be a lot more readable if you split out the
"delete unreachable code" part into a separate step.

regards, tom lane
