RE: Logical replication timeout problem

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>
Subject: RE: Logical replication timeout problem
Date: 2022-04-28 10:01:31
Message-ID: OS0PR01MB5716F30A9F73742BC8EA95A594FD9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wednesday, April 20, 2022 3:21 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> BTW the changes in
> REL_14_v1-0001-Fix-the-logical-replication-timeout-during-large-.patch,
> adding end_xact to LogicalDecodingContext, seems good to me and it
> might be better than the approach of v17 patch from plugin developers’
> perspective? This is because they won’t need to pass true/false to
> end_xact of OutputPluginUpdateProgress(). Furthermore, if we do what
> we do in update_replication_progress() in
> OutputPluginUpdateProgress(), what plugins need to do will be just to
> call OutputPluginUpdate() in every callback and they don't need to
> have the CHANGES_THRESHOLD logic. What do you think?

Hi Sawada-san, Wang

I was looking at the patch and noticed that we moved some logic from
update_replication_progress() to OutputPluginUpdateProgress() like
your suggestion.

I have a question about this change. In the patch we added some
restriction in this function OutputPluginUpdateProgress() like below:

+ /*
+ * If we are at the end of transaction LSN, update progress tracking.
+ * Otherwise, after continuously processing CHANGES_THRESHOLD changes, we
+ * try to send a keepalive message if required.
+ */
+ if (ctx->end_xact || ++changes_count >= CHANGES_THRESHOLD)
+ {
+ ctx->update_progress(ctx, ctx->write_location, ctx->write_xid,
+ skipped_xact);
+ changes_count = 0;
+ }

After the patch, we won't be able to always invoke the update_progress() if the
caller are at the middle of transaction(e.g. end_xact = false). The bebavior of the
public function OutputPluginUpdateProgress() is changed. I am thinking is it ok to
change this at back-branches ?

Because OutputPluginUpdateProgress() is a public function for the extension
developer, just a little concerned about the behavior change here.

Besides, the check of 'end_xact' and the 'CHANGES_THRESHOLD' seems specified to
the pgoutput. I am not very sure that if plugin author also needs these
logic(they might want to change the strategy), so I'd like to confirm it with
you.

Best regards,
Hou zj

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-04-28 10:25:43 Re: Logical replication timeout problem
Previous Message Wilm Hoyer 2022-04-28 09:31:17 AW: BUG #17448: In Windows 10, version 1703 and later, huge_pages doesn't work.