RE: Logical replication timeout problem

From: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>
Subject: RE: Logical replication timeout problem
Date: 2022-03-22 01:55:32
Message-ID: OS3PR01MB62756599B78FA7D4C908109A9E179@OS3PR01MB6275.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 21, 2022 at 1:31 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
Thanks for your comments.

> On Fri, Mar 18, 2022 at 4:20 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Fri, Mar 18, 2022 at 10:43 AM wangw(dot)fnst(at)fujitsu(dot)com
> > <wangw(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > On Thu, Mar 17, 2022 at 7:52 PM Masahiko Sawada
> <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > >
> > > Attach the new patch.
> > >
> >
> > *
> > case REORDER_BUFFER_CHANGE_INVALIDATION:
> > - /* Execute the invalidation messages locally */
> > - ReorderBufferExecuteInvalidations(
> > - change->data.inval.ninvalidations,
> > - change->data.inval.invalidations);
> > - break;
> > + {
> > + LogicalDecodingContext *ctx = rb->private_data;
> > +
> > + /* Try to send a keepalive message. */
> > + UpdateProgress(ctx, true);
> >
> > Calling UpdateProgress() here appears adhoc to me especially because
> > it calls OutputPluginUpdateProgress which appears to be called only
> > from plugin API. Am, I missing something? Also why the same handling
> > is missed in other similar messages like
> > REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID where we don't call
> any
> > plug-in API?
Yes, you are right.
And I invoke in case REORDER_BUFFER_CHANGE_INVALIDATION because I think every
DDL will modify the catalog then get into this case. So I only invoke function
UpdateProgress here to handle DDL.

> > I am not sure what is a good way to achieve this but one idea that
> > occurred to me was shall we invent a new callback
> > ReorderBufferSkipChangeCB similar to ReorderBufferApplyChangeCB and
> > then pgoutput can register its API where we can have the logic similar
> > to what you have in UpdateProgress()? If we do so, then all the
> > cuurent callers of UpdateProgress in pgoutput can also call that API.
> > What do you think?
> >
> Another idea could be that we leave the DDL case for now as anyway
> there is very less chance of timeout for skipping DDLs and we may
> later need to even backpatch this bug-fix which would be another
> reason to not make such invasive changes. We can handle the DDL case
> if required separately.
Yes, I think a new callback function would be nice.
Yes, as you said, maybe we could fix the usecase that found the problem in the
first place. Then make further modifications on the master branch.
Modify the patch. Currently only DML related code remains.

> > * Why don't you have a quick exit like below code in WalSndWriteData?
> > /* Try taking fast path unless we get too close to walsender timeout. */ if (now
> > < TimestampTzPlusMilliseconds(last_reply_timestamp,
> > wal_sender_timeout / 2) &&
> > !pq_is_send_pending())
> > {
> > return;
> > }
Fixed. I missed this so adding it in the new patch.

> > * Can we rename variable 'is_send' to 'change_sent'?
Improve the the name of this variable.(From 'is_send' to 'change_sent')

Attach the new patch. [suggestion by Amit-San.]
1. Remove DDL related code. Handle the DDL case later separately if need.
2. Fix a missing.(In function WalSndUpdateProgress)
3. Improve variable names. (From 'is_send' to 'change_sent')
4. Fix some comments.(Above and inside the function WalSndUpdateProgress.)

Regards,
Wang wei

Attachment Content-Type Size
v4-0001-Fix-the-timeout-of-subscriber-in-long-transaction.patch application/octet-stream 11.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-03-22 01:56:03 pgsql: Add ALTER SUBSCRIPTION ... SKIP.
Previous Message houzj.fnst@fujitsu.com 2022-03-22 01:55:31 RE: logical replication empty transactions