Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Henry Hinze <henry(dot)hinze(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop
Date: 2020-11-18 05:49:12
Message-ID: CAHut+PvQ240_4v6E1DT6gSqtHQWCTQDD-OuqCk+BxE=_DKuxbg@mail.gmail.com
Lists: pgsql-bugs

On Wed, Nov 18, 2020 at 3:17 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > To cut a long story short, a tablesync worker CAN in fact end up
> > processing (e.g. apply_dispatch) streaming messages.
> > So the tablesync worker CAN get into the apply_handle_stream_commit.
> > And this scenario, albeit rare, will crash.
> >
>
> Thank you for reproducing this issue. Dilip, Peter, is anyone of you
> interested in writing a fix for this?

Hi Amit.

FYI - Sorry, I will be away/offline for the next 5 days.

However, if this bug is still unfixed after next Tuesday, I can look
at it then.

---

IIUC there are 2 options:
a) Disallow streaming for the tablesync worker.
b) Make streaming work for the tablesync worker.

I prefer option (a), not only because of the KISS principle but also
because this is how the tablesync worker was previously thought to
behave anyway. I expect the fix would look much like the code that
Dilip already posted [1].
[1] https://www.postgresql.org/message-id/CAFiTN-uUgKpfdbwSGnn3db3mMQAeviOhQvGWE_pC9icZF7VDKg%40mail.gmail.com

OTOH, a fix for option (b) may or may not be possible (I don't know),
but I doubt it is worthwhile to make a special fix for a scenario
that so far has never been reproduced outside of the debugger.

--

Kind Regards,
Peter Smith.
Fujitsu Australia
