Re: BUG #15804: Assertion failure when using logging_collector with EXEC_BACKEND

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Yuli Khodorkovskiy <yuli(dot)khodorkovskiy(at)crunchydata(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: BUG #15804: Assertion failure when using logging_collector with EXEC_BACKEND
Date: 2019-05-18 17:07:15
Message-ID: 31224.1558199235@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

[ some notes for whoever tries to fix this in the future ]

Michael Paquier <michael(at)paquier(dot)xyz> writes:
> Trying to avoid calling pgwin32_ReserveSharedMemoryRegion() for
> processes which are never going to connect to shared memory is
> actually an interesting option, because this way there is no need to
> relax the assertions it includes,

Yeah, I'd come to the same conclusion while thinking about this,
but I don't think you're going about it in quite the right way.

In the first place, it's a mistake to try to drive the what-to-do-about-
shared-memory decision strictly off the type of the subprocess. In the
case of the syslogger, in particular, we have not only the initial launch
to think about but also a possible relaunch if the first syslogger child
crashes. In that scenario there *would* be shared memory to detach from.

So the sketch that's emerging in my mind is

1. The postmaster should have its own state variable, say
"static bool shared_memory_exists", which it sets at the appropriate
time. (We could rely on testing UsedShmemSegAddr for that, but I don't
think that'd be good design; UsedShmemSegAddr is really private state of
the particular shmem implementation.) Since internal_forkexec is
inside postmaster.c and runs in the postmaster process, it could just
test that state variable directly to see if it needs to call
pgwin32_ReserveSharedMemoryRegion.

2. I think we'd be well advised to get rid of the process-type-specific
tests in SubPostmasterMain, too. The idea in my mind is to have the
postmaster pass down an additional switch explicitly saying what to do
about shared memory, so that what SubPostmasterMain does is roughly

if (strcmp(argv[2], "--reattach") == 0)
PGSharedMemoryReAttach();
else if (strcmp(argv[2], "--noreattach") == 0)
PGSharedMemoryNoReAttach();
else
/* do nothing, shared memory does not exist yet */ ;

Then the problem for the syslogger is solved by passing --noreattach
or not depending on whether shared_memory_exists is true yet.

I've not worked that out in any detail; getting the switches included
properly might be too much of a pain for this to be an improvement.
But for sure I don't want there to be multiple copies of that list of
which subprocess types use shared memory.

A variant idea is to include shared_memory_exists in what's passed down
by the BackendParameters mechanism, and then subprocesses can adjust
their behavior for themselves.

In any case, the thrust of all of this is that we shouldn't touch any
of the assertions in the shmem support files; rather, the way to fix
it is to improve the logic in postmaster.c so that we don't call those
functions at all in child processes that are launched before shmem
exists.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2019-05-19 00:43:45 Re: BUG #15804: Assertion failure when using logging_collector with EXEC_BACKEND
Previous Message Tom Lane 2019-05-18 16:18:28 Re: BUG #15804: Assertion failure when using logging_collector with EXEC_BACKEND