Re: PostmasterIsAlive() in recovery (non-USE_POST_MASTER_DEATH_SIGNAL builds)

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostmasterIsAlive() in recovery (non-USE_POST_MASTER_DEATH_SIGNAL builds)
Date: 2021-03-12 06:55:59
Message-ID: CA+hUKG+O9c=sAL-1Oa8axtxThBLsVNbA134CyAw2VtpWd8dkYg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 11, 2021 at 7:34 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> On Thu, Mar 11, 2021 at 04:37:39PM +1300, Thomas Munro wrote:
> > Michael, when you said "That's pretty hack-ish, still efficient" in
> > reference to this code:
> >
> >> - if (IsUnderPostmaster && !PostmasterIsAlive())
> >> + if (IsUnderPostmaster &&
> >> +#ifndef USE_POSTMASTER_DEATH_SIGNAL
> >> + count++ % 1024 == 0 &&
> >> +#endif
> >> + !PostmasterIsAlive())
> >
> > Is that an objection, and do you see a specific better way?
>
> I'd like to believe that there are more elegant ways to write that,
> but based on the numbers you are giving, there is too much gain here
> to ignore it. I would avoid 1024 as a hardcoded value though, so you
> could just stick that in a #define or such. So please feel free to go
> ahead. Thanks for asking.

Thanks! I rebased over the recent recovery pause/resume state
management change and simplified the walRcvState patch a bit (no need
to broadcast for every state change, just the changes to STOPPED
state). So that gets us to the point where there are no loops with
HandleStartupProcInterrupts() and a sleep in them (that'd be bad, it'd
take a long time to notice the postmaster going away if it only checks
every 1024 loops; all loops that sleep need to be using the latch
infrastructure so they can notice the postmaster exiting immediately).
Then I turned that 1024 into a macro as you suggested for the main
patch, and pushed.

It looks like RecoveryRequiresIntParameter() should be sharing code
with recoveryPausesHere(), but I didn't try to do that in this commit.

> > I know that someone just needs to write a Windows patch to get us a
> > postmaster death signal when the postmaster's event fires, and then
> > the problem will go away on Windows. I still want this change,
> > because we don't have such a patch yet, and even when someone writes
> > that, there are still a couple of Unixes that could benefit.
>
> Wow. This probably means that we would be able to get rid of
> USE_POSTMASTER_DEATH_SIGNAL?

We'll still need it, because there'd still be systems with no signal:
NetBSD, OpenBSD, AIX, HPUX, illumos.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2021-03-12 07:05:09 Re: Parallel INSERT (INTO ... SELECT ...)
Previous Message k.jamison@fujitsu.com 2021-03-12 06:55:49 RE: [Patch] Optimize dropping of relation buffers using dlist