Re: Why does logical replication launcher exit with exit code 1?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why does logical replication launcher exit with exit code 1?
Date: 2017-08-02 00:20:28
Message-ID: 20170802002028.srtnekv4qirzob6c@alap3.anarazel.de
Lists: pgsql-hackers

On 2017-08-02 12:14:18 +1200, Thomas Munro wrote:
> On Wed, Aug 2, 2017 at 11:03 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2017-08-02 10:58:32 +1200, Thomas Munro wrote:
> >> When I shut down a cluster that isn't using logical replication, it
> >> always logs a line like the following. So do the build farm members I
> >> looked at. I didn't see anything about this in the open items list --
> >> isn't it a bug?
> >>
> >> 2017-08-02 10:39:25.007 NZST [34781] LOG: worker process: logical
> >> replication launcher (PID 34788) exited with exit code 1
> >
> > Exit code 0 signals that a worker should be restarted. Therefore
> > graceful exit can't really use that. I think a) we really need to
> > improve bgworker infrastructure around that b) shows the limit of using
> > bgworkers for this kinda thing - we should probably have a more bgworker
> > like infrastructure for internal workers.
>
> I see. In the meantime IMHO I think we should try to find a way to
> avoid printing out this message -- it looks like something is wrong to
> the uninitiated.

Well, that's how it is for all bgworkers - maybe a better solution is to
adjust that message in the postmaster rather than fiddle with the worker
exit code? Seems like we could easily take pmStatus into account
inside LogChildExit() and set the log level to DEBUG1 even for
EXIT_STATUS_1 in that case? Additionally we probably should always log
a better message for bgworkers exiting with exit 1, something about
unregistering the worker or such.
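
To make the idea concrete, here is a rough standalone sketch of the level
selection, not actual postmaster code. The enum, the function name, and the
shutting_down flag are all hypothetical stand-ins: the real LogChildExit()
works on a wait() status and the server's own log levels, and the flag here
only represents "the postmaster state says we are shutting down".

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the server's log levels; not the real values. */
typedef enum SketchLogLevel
{
    SKETCH_DEBUG1,
    SKETCH_LOG
} SketchLogLevel;

/*
 * Choose the level for a "worker process ... exited" message.
 *
 * exit_code is an already-decoded exit code (the real postmaster inspects
 * a wait() status instead). shutting_down is an assumed flag derived from
 * the postmaster's state.
 */
static SketchLogLevel
sketch_child_exit_level(int exit_code, bool shutting_down)
{
    /*
     * Assumption: exit code 0 is already logged quietly, which is what
     * "even for EXIT_STATUS_1" in the text implies.
     */
    if (exit_code == 0)
        return SKETCH_DEBUG1;

    /* The proposed change: exit code 1 during shutdown is also quiet. */
    if (exit_code == 1 && shutting_down)
        return SKETCH_DEBUG1;

    /* Everything else stays visible at the normal level. */
    return SKETCH_LOG;
}
```

With this shape, a launcher exiting with code 1 while the cluster shuts down
would map to the quiet level, while the same exit code during normal
operation would still be logged as before.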

> Possibly stupid question: why do we restart workers when we know we're
> shutting down anyway? Hmm, I suppose there might conceivably be
> workers that need to do something during shutdown and they might not
> have done it yet.

The launcher doesn't really know the reason for the shutdown.

- Andres
