Re: BUG #15804: Assertion failure when using logging_collector with EXEC_BACKEND

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Yuli Khodorkovskiy <yuli(dot)khodorkovskiy(at)crunchydata(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15804: Assertion failure when using logging_collector with EXEC_BACKEND
Date: 2019-05-20 15:01:05
Message-ID: 7927.1558364465@sss.pgh.pa.us
Lists: pgsql-bugs

I wrote:
> This line of thought suggests that trying to fix things so that
> we can launch child processes before creating shared memory
> is the wrong thing, because it seriously risks creating problems
> in the leftover-child-processes scenario.

> This means that the change that 57431a911 wanted to make is only
> going to be safe if we're willing to re-order things so that the
> startup sequence is

> * create datadir lock file
> * create shmem
> * launch syslogger
> * create sockets

In other words, the right way to think about this is less "move
syslogger launch to earlier" and more "move port opening to later".

I did some cursory testing of that idea with the attached patch,
which simply relocates the port opening logic to below where
syslogger start is (though "git diff" insists on presenting it
differently :-(). I also moved and recommented the emission
of the "starting ..." log entry. It works under EXEC_BACKEND,
but I'm not fool enough to believe that that proves it works
under Windows :-(.

One issue with this is that we can't be sure we have sole control
of the postmaster port number at the time we create shmem.
Hence, to avoid undesirable shmem key conflicts, we should change
things to base the shmem key on the datadir's ID rather than the
port number, as was already speculated about in
https://postgr.es/m/16908.1557521200@sss.pgh.pa.us
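
To make the datadir-based key idea concrete, here is a minimal sketch. It
is not taken from the attached patch: the helper name datadir_shmem_key is
hypothetical, and it assumes that the "datadir's ID" means something like
the inode number reported by stat(). A real implementation would still
have to probe for collisions with existing segments, since unrelated
directories can map to the same key.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (illustration only): derive a starting SysV shmem
 * key from the data directory's inode number instead of the port number.
 * Returns 0 on success, -1 if the directory cannot be stat'ed.
 */
int
datadir_shmem_key(const char *datadir, key_t *key)
{
    struct stat st;

    if (stat(datadir, &st) != 0)
        return -1;

    /* Narrow the inode number to key_t; high bits may be lost. */
    *key = (key_t) st.st_ino;

    /* IPC_PRIVATE has special meaning to shmget(), so never return it. */
    if (*key == IPC_PRIVATE)
        *key = 1;

    return 0;
}
```

The point of the sketch is only that the key depends on the data directory
alone, so it stays stable whether or not the port has been opened yet.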

Also, this will change the order in which entries get made into
postmaster.pid. I think that's OK, but we'll need to take a
close look at pg_ctl to be sure it isn't making any invalid
assumptions.

Another point is that we want to be sure this doesn't change
the order in which lockfiles are released at shutdown. That
seems OK (I confirmed by strace'ing that the postmaster's
final syscalls are still done in the same order) but it could
use some additional eyeballs on it.

There may be some other reorderings that would be a good idea.
In particular I'm thinking that the CreateOptsFile call should
be pushed down, so that the options file doesn't get written
until we know that the port number is OK.

regards, tom lane

Attachment: postpone-port-opening-1.patch (text/x-diff, 9.1 KB)
