Re: Excessive PostmasterIsAlive calls slow down WAL redo

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Excessive PostmasterIsAlive calls slow down WAL redo
Date: 2018-07-10 11:39:38
Message-ID: e30fc40c-21b7-fe26-6dc3-3326f3f346e1@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 27/06/18 08:26, Thomas Munro wrote:
> On Wed, Apr 25, 2018 at 6:23 PM, Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> On Tue, Apr 24, 2018 at 7:37 PM, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>>> Thomas, trying to understand here... Why this place for the signal
>>> initialization? Wouldn't InitPostmasterChild() be a more logical place
>>> as we'd want to have this logic caught by all other processes?
>>
>> Yeah, you're right. Here's one like that.
>
> Amazingly, due to the way the project schedules fell and thanks to the
> help of a couple of very supportive FreeBSD committers and reviewers,
> the new PROC_PDEATHSIG_CTL kernel facility[1] landed in a production
> release today, beating the Commitfest by several days.
>
> My question is whether this patch set is enough, or people (Andres?) want
> to go further and actually kill the postmaster death pipe completely.
> My kqueue patch would almost completely kill it on BSDs as it happens,
> but for Linux there are a number of races and complications to plug to
> do that IIUC. I don't immediately see what there is to gain by doing
> that work (assuming we reuse WaitEventSet objects in enough places),
> and we'll always need to maintain the pipe code for other OSes anyway.
> So I'm -0.5 on doing that, even though the technical puzzle is
> appealing...
>
> [1] https://www.freebsd.org/cgi/man.cgi?query=procctl&apropos=0&sektion=2&manpath=FreeBSD+11.2-RELEASE&arch=default&format=html

Yeah, I don't think we can kill the pipe completely. As long as we still
need it for the other OSes, I don't see much point in eliminating it on
Linux and BSDs either. I'd rather keep the code similar across platforms.

Looking at the patch:

The 'postmaster_possibly_dead' flag is not reset anywhere. So if a
process receives a spurious death signal, even though postmaster is
still alive, PostmasterIsAlive() will continue to use the slow path.

postmaster_possibly_dead needs to be marked as 'volatile', no?

The autoconf check for PR_SET_PDEATHSIG seems slightly misplaced. And I
think we can simplify it with AC_CHECK_HEADER(). I'd also like to avoid
adding code to c.h for this, that seems too global.

After some kibitzing, I ended up with the attached. It fixes the
postmaster_possible_dead issues mentioned above, and moves around the
autoconf and #ifdef logic a bit to make it a bit nicer, at least in my
opinion. I don't have a FreeBSD machine at hand, so I didn't try fixing
that patch.

- Heikki

Attachment Content-Type Size
0001-Use-signals-for-postmaster-death-on-Linux-2.patch text/x-patch 11.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2018-07-10 12:07:40 Re: shared-memory based stats collector
Previous Message Alexander Korotkov 2018-07-10 11:36:16 Re: Locking B-tree leafs immediately in exclusive mode