Re: Logical replication timeout problem

From: Fabrice Chapuis <fabrice636861(at)gmail(dot)com>
To: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Logical replication timeout problem
Date: 2022-01-28 11:35:30
Message-ID: CAA5-nLB5pXsoXBdf9WPB8b04=nWPp=i4nm9X==S18SQ3Di8Xhg@mail.gmail.com
Lists: pgsql-hackers

Thanks for your new fix, Wang.

TimestampTz ping_time = TimestampTzPlusMilliseconds(sendTime,
wal_sender_timeout / 2);

Shouldn't we use receiver_timeout in place of wal_sender_timeout, because the
problem comes from the consumer?
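
To make the question concrete, here is a minimal standalone sketch of the check that line feeds into. It is not the patch itself: the typedef and macro are local stand-ins mirroring PostgreSQL's definitions, and keepalive_due() is a hypothetical helper that takes the timeout as a parameter, so either setting could be plugged in for comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins so the sketch compiles outside the PostgreSQL tree. */
typedef int64_t TimestampTz;	/* microseconds */
#define TimestampTzPlusMilliseconds(tz, ms) ((tz) + ((ms) * (int64_t) 1000))

/*
 * Hypothetical helper: report whether a keepalive is due, i.e. whether
 * half of the given timeout has elapsed since send_time.
 */
static bool
keepalive_due(TimestampTz now, TimestampTz send_time, int timeout_ms)
{
	TimestampTz ping_time;

	/* Assumption: a timeout of 0 (or less) disables the mechanism. */
	if (timeout_ms <= 0)
		return false;

	ping_time = TimestampTzPlusMilliseconds(send_time, timeout_ms / 2);
	return now >= ping_time;
}
```

With a timeout of 60000 ms, the helper starts returning true 30 seconds after send_time, whichever setting the value was taken from.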

On Wed, Jan 26, 2022 at 4:37 AM wangw(dot)fnst(at)fujitsu(dot)com <
wangw(dot)fnst(at)fujitsu(dot)com> wrote:

> On Thu, Jan 22, 2022 at 7:12 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> > Now, one idea to solve this problem could be that whenever we skip
> > sending any change we do try to update the plugin progress via
> > OutputPluginUpdateProgress(for walsender, it will invoke
> > WalSndUpdateProgress), and there it tries to process replies and send
> > keep_alive if necessary as we do when we send some data via
> > OutputPluginWrite(for walsender, it will invoke WalSndWriteData). I
> > don't know whether it is a good idea to invoke such a mechanism for
> > every change we skip to send or we should do it after we skip sending
> > some threshold of continuous changes. I think later would be
> > preferred. Also, we might want to introduce a new parameter
> > send_keep_alive to this API so that there is flexibility to invoke
> > this mechanism as we don't need to invoke it while we are actually
> > sending data and before that, we just update the progress via this
> > API.
>
> I tried out the patch according to your advice.
> I found if I invoke ProcessRepliesIfAny and WalSndKeepaliveIfNecessary in
> function OutputPluginUpdateProgress, the running time of the newly added
> function OutputPluginUpdateProgress invoked in pgoutput_change brings
> notable overhead:
> --11.34%--pgoutput_change
>           |
>           |--8.94%--OutputPluginUpdateProgress
>           |          |
>           |           --8.70%--WalSndUpdateProgress
>           |                     |
>           |                     |--7.44%--ProcessRepliesIfAny
>
> So I tried another way: sending a keepalive message to the standby machine
> based on the timeout, without asking for a reply (see attachment). The
> running time of the newly added function OutputPluginUpdateProgress invoked
> in pgoutput_change also brings slight overhead:
> --3.63%--pgoutput_change
>           |
>           |--1.40%--get_rel_sync_entry
>           |          |
>           |           --1.14%--hash_search
>           |
>            --1.08%--OutputPluginUpdateProgress
>                      |
>                       --0.85%--WalSndUpdateProgress
>
> Based on the above, I think the second idea, invoking the mechanism only
> after skipping some threshold of continuous changes, might be better. I
> will do some research about this approach.
>
> Regards,
> Wang wei
>
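
For reference, the threshold idea discussed in the quoted text could look roughly like the sketch below. Every name here is hypothetical (SKIP_THRESHOLD, SkipTracker, note_change), the value 100 is an arbitrary assumption, and the counter progress_updates merely stands in for a call to OutputPluginUpdateProgress; this is not taken from any patch.

```c
#include <stdbool.h>

/* Hypothetical threshold; the right value would need measuring. */
#define SKIP_THRESHOLD 100

typedef struct SkipTracker
{
	int			skipped_since_update;	/* consecutive skipped changes */
	int			progress_updates;		/* stand-in for progress-update calls */
} SkipTracker;

/*
 * Record one decoded change. Returns true when enough consecutive changes
 * have been skipped that the progress/keepalive mechanism should run.
 */
static bool
note_change(SkipTracker *t, bool change_was_sent)
{
	if (change_was_sent)
	{
		/* Sending data already gives the sender a chance to handle replies. */
		t->skipped_since_update = 0;
		return false;
	}

	if (++t->skipped_since_update < SKIP_THRESHOLD)
		return false;

	t->skipped_since_update = 0;
	t->progress_updates++;
	return true;
}
```

The point of the counter is that the costly path (the one dominating the first profile above) would run once per SKIP_THRESHOLD skipped changes instead of on every change.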
