Re: PG in container w/ pid namespace is init, process exits cause restart

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PG in container w/ pid namespace is init, process exits cause restart
Date: 2021-05-03 19:25:53
Message-ID: 20210503192553.GA10866@alvherre.pgsql
Lists: pgsql-hackers

On 2021-May-03, Andres Freund wrote:

> The issue turns out to be that postgres was in a container, with pid
> namespaces enabled. Because postgres was run directly in the container,
> without a parent process inside, it thus becomes pid 1. Which mostly
> works without a problem. Until, as the case here with the archive
> command, a sub-sub process exits while it still has a child. Then that
> child gets re-parented to postmaster (as init).

Hah .. interesting. I think we should definitely make this work, since
containerized stuff is going to become more and more prevalent.

I also heard a story where things ran into trouble (I never got the
full story of *what* the problem was) because the datadir is /.
I know -- nobody in their right mind would put the datadir in / -- but
apparently in the container world that's not as stupid as it
sounds. That's of course not related to what you describe here
code-wise, but the underlying reason is the same.

> I wonder if we should work a bit harder to try to identify whether an
> exiting process was a "server process" before identifying it as such?

Well, we've never made any effort there because it just wasn't possible.
Nobody ever had postmaster also be init .. until containers. Let's fix
it.

> And perhaps we ought to warn about postgres running as "init" unless we
> make that robust?

I guess we can do that in older releases, but do we really need it? As
I understand it, the only thing we need to do is verify that the dying
PID is a backend PID, and not cause a crash cycle if it isn't.
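
A minimal sketch of that check, to make the idea concrete. This is not
the actual postmaster code: the registry, the function names
(track_child, untrack_child, classify_exit) and the ReapAction enum are
all hypothetical, and the exit-status handling is deliberately
simplified. The point is only that a PID we never forked ourselves (an
orphan re-parented to us because we are pid 1) gets ignored instead of
being treated as a crashed server process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical fixed-size registry of children this process forked itself. */
#define MAX_TRACKED_CHILDREN 64

pid_t  tracked_children[MAX_TRACKED_CHILDREN];
size_t n_tracked_children = 0;

/* Record a PID returned by our own fork(). */
void
track_child(pid_t pid)
{
    assert(n_tracked_children < MAX_TRACKED_CHILDREN);
    tracked_children[n_tracked_children++] = pid;
}

/* Remove pid from the registry; return true only if it was one of ours. */
bool
untrack_child(pid_t pid)
{
    for (size_t i = 0; i < n_tracked_children; i++)
    {
        if (tracked_children[i] == pid)
        {
            /* Order does not matter, so fill the hole with the last entry. */
            tracked_children[i] = tracked_children[--n_tracked_children];
            return true;
        }
    }
    return false;
}

typedef enum
{
    REAP_IGNORE,        /* unknown PID, e.g. an orphan re-parented to pid 1 */
    REAP_CLEAN_EXIT,    /* known child exited with status 0 */
    REAP_CRASH          /* known child died abnormally (simplified rule) */
} ReapAction;

/* Decide what to do with a (pid, wstatus) pair as returned by waitpid(). */
ReapAction
classify_exit(pid_t pid, int wstatus)
{
    if (!untrack_child(pid))
        return REAP_IGNORE;
    if (WIFEXITED(wstatus) && WEXITSTATUS(wstatus) == 0)
        return REAP_CLEAN_EXIT;
    return REAP_CRASH;
}
```

In a SIGCHLD handler this would sit inside the usual
waitpid(-1, &wstatus, WNOHANG) loop, with only REAP_CRASH leading to a
restart of the other children. Unknown PIDs still get reaped by that
waitpid() call, so they don't linger as zombies.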

--
Álvaro Herrera Valdivia, Chile
