Re: Logical replication timeout problem

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>
Subject: Re: Logical replication timeout problem
Date: 2022-03-17 11:51:45
Message-ID: CAD21AoBHOexAtPh71ckBtPBwuNQUbsi8eQSRmiz4cGunszDqxg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 17, 2022 at 7:14 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Mar 17, 2022 at 12:27 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Wed, Mar 16, 2022 at 7:38 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > After more thought, can we check only wal_sender_timeout without
> > > skip-count? That is, in WalSndUpdateProgress(), if we have received
> > > any reply from the subscriber in last (wal_sender_timeout / 2), we
> > > don't need to do anything in terms of keep-alive. If not, we do
> > > ProcessRepliesIfAny() (and probably WalSndCheckTimeOut()?) then
> > > WalSndKeepalivesIfNecessary(). That way, we can send keep-alive
> > > messages every (wal_sender_timeout / 2). And since we don't call them
> > > for every change, we would not need to worry about the overhead much.
> > >
> >
> > But won't that lead to a call to GetCurrentTimestamp() for each change
> > we skip? IIUC from previous replies that lead to a slight slowdown in
> > previous tests of Wang-San.
> >
> If the above is true then I think we can use a lower skip_count say 10
> along with a timeout mechanism to send keepalive message. This will
> help us to alleviate the overhead Wang-San has shown.

Using both sounds reasonable to me. I'd like to see how much the
overhead is alleviated by using skip_count 10 (or 100).

> BTW, I think there could be one other advantage of using
> ProcessRepliesIfAny() (as you are suggesting) is that it can help to
> release sync waiters if there are any. I feel that would be the case
> for the skip_empty_transactions patch [1] which uses
> WalSndUpdateProgress to send keepalive messages after skipping empty
> transactions.

+1

Regards,

[1] https://www.postgresql.org/message-id/OS3PR01MB6275DFFDAC7A59FA148931529E209%40OS3PR01MB6275.jpnprd01.prod.outlook.com

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shinoda, Noriyoshi (PN Japan FSIP) 2022-03-17 12:01:14 RE: ICU for global collation
Previous Message Amit Kapila 2022-03-17 11:43:11 Re: logical replication empty transactions