Re: Logical replication timeout problem

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	"wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc:	Ajin Cherian <itsajin(at)gmail(dot)com>, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Logical replication timeout problem
Date:	2022-02-23 08:55:34
Message-ID:	CAA4eK1+-p_K_j=NiGGD6tCYXiJH0ypT4REX5PBKJ4AcUoF3gZQ@mail.gmail.com
Lists:	pgsql-hackers

On Tue, Feb 22, 2022 at 9:17 AM wangw(dot)fnst(at)fujitsu(dot)com
<wangw(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Fri, Feb 18, 2022 at 10:51 AM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> > Some comments:
> Thanks for your review.
>
> > I see you only track skipped Inserts/Updates and Deletes. What about
> > DDL operations that are skipped, what about truncate.
> > What about changes made to unpublished tables? I wonder if you could
> > create a test script that only did DDL operations
> > and truncates, would this timeout happen?
> According to your suggestion, I tested with DDL and truncate.
> While testing, I ran only 20,000 DDLs and 10,000 truncations in one
> transaction.
> If I set wal_sender_timeout and wal_receiver_timeout to 30s, it will time out.
> And if I use the default values, it will not time out.
> IMHO there should not be long transactions that only contain DDL and
> truncation. I'm not quite sure, do we need to handle this kind of use case?
>

I think it is better to handle such cases too, along with changes
related to unpublished tables. BTW, it seems Kuroda-San has also given
some comments [1], and I am not sure whether they have been addressed.

I think instead of tying the skipping threshold to
wal_sender_timeout, we can use some conservative fixed number, like
10000 or so, which we are sure won't impact performance and won't lead
to timeouts.

*
+ /*
+ * skipped_changes_count is reset when processing changes that do not need to
+ * be skipped.
+ */
+ skipped_changes_count = 0;

When the skipped_changes_count is reset, the sendTime should also be
reset. Can we reset it whenever the UpdateProgress function is called
with send_keep_alive as false?

[1] - https://www.postgresql.org/message-id/TYAPR01MB5866BD2248EF82FF432FE599F52D9%40TYAPR01MB5866.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.

In response to

RE: Logical replication timeout problem at 2022-02-22 03:47:08 from wangw.fnst@fujitsu.com

Responses

RE: Logical replication timeout problem at 2022-02-28 07:40:51 from wangw.fnst@fujitsu.com
