RE: Logical replication timeout problem

From: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>
Subject: RE: Logical replication timeout problem
Date: 2022-03-03 09:18:46
Message-ID: TYAPR01MB586635CF2AB592635E54AC19F5049@TYAPR01MB5866.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Wang,

> Attach the new patch. [suggestion by Kuroda-San]
> 1. Fix the typo.
> 2. Improve comment style.
> 3. Fix missing consideration.
> 4. Add comments to clarifies above functions and function calls.

Thank you for updating the patch! I confirmed they were fixed.

```
case REORDER_BUFFER_CHANGE_INVALIDATION:
- /* Execute the invalidation messages locally */
- ReorderBufferExecuteInvalidations(
- change->data.inval.ninvalidations,
- change->data.inval.invalidations);
- break;
+ {
+ LogicalDecodingContext *ctx = rb->private_data;
+
+ Assert(!ctx->fast_forward);
+
+ /* Set output state. */
+ ctx->accept_writes = true;
+ ctx->write_xid = txn->xid;
+ ctx->write_location = change->lsn;
```

Some codes were added in ReorderBufferProcessTXN() for treating DDL,

I'm also happy if you give the version number :-).

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

> -----Original Message-----
> From: Wang, Wei/王 威 <wangw(dot)fnst(at)fujitsu(dot)com>
> Sent: Wednesday, March 2, 2022 11:06 AM
> To: Kuroda, Hayato/黒田 隼人 <kuroda(dot)hayato(at)fujitsu(dot)com>
> Cc: Fabrice Chapuis <fabrice636861(at)gmail(dot)com>; Simon Riggs
> <simon(dot)riggs(at)enterprisedb(dot)com>; Petr Jelinek
> <petr(dot)jelinek(at)enterprisedb(dot)com>; Tang, Haiying/唐 海英
> <tanghy(dot)fnst(at)fujitsu(dot)com>; Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>;
> PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>; Ajin Cherian
> <itsajin(at)gmail(dot)com>
> Subject: RE: Logical replication timeout problem
>
> On Mon, Feb 28, 2022 at 6:58 PM Kuroda, Hayato/黒田 隼人
> <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> > Dear Wang,
> >
> > > Attached a new patch that addresses following improvements I have got
> > > so far as
> > > comments:
> > > 1. Consider other changes that need to be skipped(truncate, DDL and
> > > function calls pg_logical_emit_message). [suggestion by Ajin, Amit]
> > > (Add a new function SendKeepaliveIfNecessary for trying to send
> > > keepalive
> > > message.)
> > > 2. Set the threshold conservatively to a static value of
> > > 10000.[suggestion by Amit, Kuroda-San] 3. Reset sendTime in function
> > > WalSndUpdateProgress when send_keep_alive is false. [suggestion by
> > > Amit]
> >
> > Thank you for giving a good patch! I'll check more detail later, but it can be
> > applied my codes and passed check world.
> > I put some minor comments:
> Thanks for your comments.
>
> > ```
> > + * Try to send keepalive message
> > ```
> >
> > Maybe missing "a"?
> Fixed. Add missing "a".
>
> > ```
> > + /*
> > + * After continuously skipping SKIPPED_CHANGES_THRESHOLD
> changes, try
> > to send a
> > + * keepalive message.
> > + */
> > ```
> >
> > This comments does not follow preferred style:
> > https://www.postgresql.org/docs/devel/source-format.html
> Fixed. Correct wrong comment style.
>
> > ```
> > @@ -683,12 +683,12 @@ OutputPluginWrite(struct LogicalDecodingContext
> *ctx,
> > bool last_write)
> > * Update progress tracking (if supported).
> > */
> > void
> > -OutputPluginUpdateProgress(struct LogicalDecodingContext *ctx)
> > +OutputPluginUpdateProgress(struct LogicalDecodingContext *ctx, bool
> > +send_keep_alive)
> > ```
> >
> > This function is no longer doing just tracking.
> > Could you update the code comment above?
> Fixed. Update the comment above function OutputPluginUpdateProgress.
>
> > ```
> > if (!is_publishable_relation(relation))
> > return;
> > ```
> >
> > I'm not sure but it seems that the function exits immediately if relation is a
> > sequence, view, temporary table and so on. Is it OK? Does it never happen?
> I did some checks to confirm this. After my confirmation, there are several
> situations that can cause a timeout. For example, if I insert many date into
> table sql_features in a long transaction, subscriber-side will time out.
> Although I think users should not modify these tables arbitrarily, it could
> happen. To be conservative, I think this use case should be addressed as well.
> Fixed. Invoke function SendKeepaliveIfNecessary before return.
>
> > ```
> > + SendKeepaliveIfNecessary(ctx, false);
> > ```
> >
> > I think a comment is needed above which clarifies sending a keepalive
> message.
> Fixed. Before invoking function SendKeepaliveIfNecessary, add the
> corresponding
> comment.
>
> Attach the new patch. [suggestion by Kuroda-San]
> 1. Fix the typo.
> 2. Improve comment style.
> 3. Fix missing consideration.
> 4. Add comments to clarifies above functions and function calls.
>
> Regards,
> Wang wei

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jille Timmermans 2022-03-03 09:21:05 Re: Support for grabbing multiple consecutive values with nextval()
Previous Message Yugo NAGATA 2022-03-03 09:11:44 Re: Commitfest 2022-03 Patch Triage Part 1a.i