Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Henry Hinze <henry(dot)hinze(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop
Date: 2020-11-17 08:58:53
Message-ID: CAA4eK1J3+4wT_biGP7ud=JFq-KUeH-BF5AwY_Xgay9AsGMAPQw@mail.gmail.com
Lists: pgsql-bugs

On Tue, Nov 17, 2020 at 7:44 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Tue, Nov 17, 2020 at 1:07 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> > Yeah, this seems to be possible, and this is the reason I mentioned
> > above to dig more into this case. Did you try it via some test case? I
> > think we can generate a test via debugger: after the tablesync
> > worker reaches the CATCHUP state, stop it via the debugger, then perform
> > some large transaction on the same table, which the apply worker will
> > skip and the tablesync worker will try to apply and should fail.
>
> Hello Amit.
>
> FYI - This email is just to confirm that your above idea for debugging
> the tablesync worker does work.
>

Thanks for trying this out.

> ---
>
> I have so far only been trying the above with a non-streaming
> subscription, and TBH stepping through this startup state "dance" of
> the tablesync/apply workers is already raising more questions than
> answers for me. Anyway, I will raise any questions in emails separate
> from this one.
>
> BTW, do you think these tablesync discussions should be moved to
> the hackers list?
>

Sure. I think it is better to start a new thread for the streaming
issue we suspect here and the possible ways to fix it. I guess you
have some other observations as well that you might want to discuss
separately.

--
With Regards,
Amit Kapila.
