RE: Logical replication timeout problem

From: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
To: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>
Subject: RE: Logical replication timeout problem
Date: 2022-03-02 02:06:17
Message-ID: OS3PR01MB62750D04D4A84A3FA27D7B099E039@OS3PR01MB6275.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 28, 2022 at 6:58 PM Kuroda, Hayato/黒田 隼人 <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> Dear Wang,
>
> > Attached a new patch that addresses following improvements I have got
> > so far as
> > comments:
> > 1. Consider other changes that need to be skipped(truncate, DDL and
> > function calls pg_logical_emit_message). [suggestion by Ajin, Amit]
> > (Add a new function SendKeepaliveIfNecessary for trying to send
> > keepalive
> > message.)
> > 2. Set the threshold conservatively to a static value of
> > 10000.[suggestion by Amit, Kuroda-San] 3. Reset sendTime in function
> > WalSndUpdateProgress when send_keep_alive is false. [suggestion by
> > Amit]
>
> Thank you for giving a good patch! I'll check more detail later, but it can be
> applied my codes and passed check world.
> I put some minor comments:
Thanks for your comments.

> ```
> + * Try to send keepalive message
> ```
>
> Maybe missing "a"?
Fixed. Add missing "a".

> ```
> + /*
> + * After continuously skipping SKIPPED_CHANGES_THRESHOLD changes, try
> to send a
> + * keepalive message.
> + */
> ```
>
> This comments does not follow preferred style:
> https://www.postgresql.org/docs/devel/source-format.html
Fixed. Correct wrong comment style.

> ```
> @@ -683,12 +683,12 @@ OutputPluginWrite(struct LogicalDecodingContext *ctx,
> bool last_write)
> * Update progress tracking (if supported).
> */
> void
> -OutputPluginUpdateProgress(struct LogicalDecodingContext *ctx)
> +OutputPluginUpdateProgress(struct LogicalDecodingContext *ctx, bool
> +send_keep_alive)
> ```
>
> This function is no longer doing just tracking.
> Could you update the code comment above?
Fixed. Update the comment above function OutputPluginUpdateProgress.

> ```
> if (!is_publishable_relation(relation))
> return;
> ```
>
> I'm not sure but it seems that the function exits immediately if relation is a
> sequence, view, temporary table and so on. Is it OK? Does it never happen?
I did some checks to confirm this. After my confirmation, there are several
situations that can cause a timeout. For example, if I insert many date into
table sql_features in a long transaction, subscriber-side will time out.
Although I think users should not modify these tables arbitrarily, it could
happen. To be conservative, I think this use case should be addressed as well.
Fixed. Invoke function SendKeepaliveIfNecessary before return.

> ```
> + SendKeepaliveIfNecessary(ctx, false);
> ```
>
> I think a comment is needed above which clarifies sending a keepalive message.
Fixed. Before invoking function SendKeepaliveIfNecessary, add the corresponding
comment.

Attach the new patch. [suggestion by Kuroda-San]
1. Fix the typo.
2. Improve comment style.
3. Fix missing consideration.
4. Add comments to clarifies above functions and function calls.

Regards,
Wang wei

Attachment Content-Type Size
0001-Fix-the-timeout-of-subscriber-in-long-transactions.patch application/octet-stream 13.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-03-02 02:12:50 Re: pg_stat_get_replication_slot and pg_stat_get_subscription_worker incorrectly marked as proretset
Previous Message Amit Kapila 2022-03-02 02:05:46 Re: Design of pg_stat_subscription_workers vs pgstats