Re: Get stuck when dropping a subscription during synchronizing table

From: Andres Freund <andres(at)anarazel(dot)de>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Get stuck when dropping a subscription during synchronizing table
Date: 2017-06-07 01:00:14
Message-ID: 20170607010014.rz6nfusdav3uwb5o@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017-06-06 19:36:13 +0200, Petr Jelinek wrote:
> So the fact that we moved workers to standard interrupt handling broke
> launcher in subtle ways because it still uses it's own SIGTERM handling
> but some function it calls are using CHECK_FOR_INTERRUPTS (they are used
> by worker as well). I think we need to move launcher to standard
> interrupt handling as well.

Sounds like a good plan.

> As a side note, we are starting to have several IsSomeTypeOfProcess
> functions for these kind of things. I wonder if bgworker infrastructure
> should somehow provide this type of stuff (the proposed bgw_type might
> help there) as well as maybe being able to register interrupt handler
> for the worker (it's really hard to do it via custom SIGTERM handler as
> visible on worker, launcher and walsender issues we are fixing).
> Obviously this is PG11 related thought.

Maybe it's also a sign that the bgworker infrastructure isn't really the
best thing to use for internal processes like parallel workers and lrep
processes - it's really built so core code *doesn't* know anything
specific about them, which isn't really what we want in some of these
cases. That's not to say that bgworkers and parallelism/lrep shouldn't
share infrastructure, don't get me wrong.

>
> - LogicalRepCtx->launcher_pid = 0;
> -
> - /* ... and if it returns, we're done */
> - ereport(DEBUG1,
> - (errmsg("logical replication launcher shutting down")));
> + /* Not reachable */
> +}

Maybe put a pg_unreachable() there?

> @@ -2848,6 +2849,14 @@ ProcessInterrupts(void)
> ereport(FATAL,
> (errcode(ERRCODE_ADMIN_SHUTDOWN),
> errmsg("terminating logical replication worker due to administrator command")));
> + else if (IsLogicalLauncher())
> + {
> + ereport(DEBUG1,
> + (errmsg("logical replication launcher shutting down")));
> +
> + /* The logical replication launcher can be stopped at any time. */
> + proc_exit(0);
> + }

We could use PgBackendStatus->st_backendType for these, merging these
different paths.

I can take care of this one, if you/Peter want.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-06-07 01:12:26 Re: [BUGS] BUG #14682: row level security not work with partitioned table
Previous Message Adam Brusselback 2017-06-07 00:57:52 Re: PG10 transition tables, wCTEs and multiple operations on the same table