Re: Logical replication timeout problem

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>
Subject: Re: Logical replication timeout problem
Date: 2023-01-30 06:54:51
Message-ID: CAA4eK1LJETEL5zXDsviLOuWuyLSFaGufv3uP0N-XjhV2X6mU5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 30, 2023 at 10:36 AM wangw(dot)fnst(at)fujitsu(dot)com
<wangw(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Mon, Jan 30, 2023 11:37 AM Shi, Yu/侍 雨 <shiy(dot)fnst(at)cn(dot)fujitsu(dot)com> wrote:
> > On Sun, Jan 29, 2023 3:41 PM wangw(dot)fnst(at)fujitsu(dot)com
> > <wangw(dot)fnst(at)fujitsu(dot)com> wrote:
>
> Yes, I think you are right.
> Fixed this problem.
>

+ /*
+ * Trying to send keepalive message after every change has some
+ * overhead, but testing showed there is no noticeable overhead if
+ * we do it after every ~100 changes.
+ */
+#define CHANGES_THRESHOLD 100
+
+ if (++changes_count < CHANGES_THRESHOLD)
+ return;
...
+ changes_count = 0;

I think it is better to have this threshold-related code in that
caller as we have in the previous version. Also, let's modify the
comment as follows:"
It is possible that the data is not sent to downstream for a long time
either because the output plugin filtered it or there is a DDL that
generates a lot of data that is not processed by the plugin. So, in
such cases, the downstream can timeout. To avoid that we try to send a
keepalive message if required. Trying to send a keepalive message
after every change has some overhead, but testing showed there is no
noticeable overhead if we do it after every ~100 changes."

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2023-01-30 07:08:02 Re: Time delayed LR (WAS Re: logical replication restrictions)
Previous Message Amit Langote 2023-01-30 06:39:12 Re: SQL/JSON revisited