Re: Time delayed LR (WAS Re: logical replication restrictions)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "vignesh21(at)gmail(dot)com" <vignesh21(at)gmail(dot)com>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>, "m(dot)melihmutlu(at)gmail(dot)com" <m(dot)melihmutlu(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "marcos(at)f10(dot)com(dot)br" <marcos(at)f10(dot)com(dot)br>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "smithpb2250(at)gmail(dot)com" <smithpb2250(at)gmail(dot)com>
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)
Date: 2022-12-14 11:00:28
Message-ID: CAA4eK1K5rb4r7FJjbnq=n7nFT=fTEg4YQs1d65aryJztTrFNuA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 14, 2022 at 4:16 PM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear Horiguchi-san, Amit,
>
> > > On Tue, Dec 13, 2022 at 7:35 AM Kyotaro Horiguchi
> > > <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > > >
> > > > At Mon, 12 Dec 2022 18:10:00 +0530, Amit Kapila
> > <amit(dot)kapila16(at)gmail(dot)com> wrote in
> > > Yeah, I think ideally it will timeout but if we have a solution like
> > > during delay, we keep sending ping messages time-to-time, it should
> > > work fine. However, that needs to be verified. Do you see any reasons
> > > why that won't work?
>
> I have implemented and tested a version in which workers wake up every
> wal_receiver_timeout/2 and send a keepalive. Basically it works well, but I
> found two problems. Do you have any good suggestions about them?
>
> 1)
>
> With this PoC at present, workers calculate the sending interval from their own
> wal_receiver_timeout, and keepalives are suppressed when that parameter is set to zero.
>
> This means that the walsender may time out when wal_sender_timeout on the
> publisher and wal_receiver_timeout on the subscriber differ.
> Suppose that wal_sender_timeout is 2min, wal_receiver_timeout is 5min,
> and min_apply_delay is 10min. The worker on the subscriber will wake up every 2.5min and
> send keepalives, but the walsender exits before the first message arrives at the publisher.
>
> One idea to avoid that is to send the min_apply_delay subscriber option to the publisher
> and compare them, but that may not be sufficient, because the XXX_timeout GUC parameters
> could be modified later.
>

How about restarting the apply worker if min_apply_delay changes? Will
that be sufficient?
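
To make the timing in the quoted example concrete, here is a minimal Python
sketch of the scenario. It is a simplified model, not PostgreSQL code: the
function names are hypothetical, it assumes the delaying worker sends a
keepalive only every wal_receiver_timeout/2 (as in the PoC described above),
and it ignores network latency and any other traffic on the connection.

```python
def keepalive_interval(wal_receiver_timeout_s):
    """Seconds between keepalives sent by a delaying worker in this model.

    Returns None when wal_receiver_timeout is 0, because the PoC described
    above suppresses keepalives in that case.
    """
    if wal_receiver_timeout_s == 0:
        return None
    return wal_receiver_timeout_s / 2


def walsender_times_out(wal_sender_timeout_s, wal_receiver_timeout_s,
                        min_apply_delay_s):
    """Return True if the publisher hears nothing for longer than
    wal_sender_timeout while the subscriber is delaying (simplified model).
    """
    if wal_sender_timeout_s == 0:
        return False  # a zero wal_sender_timeout disables the timeout

    interval = keepalive_interval(wal_receiver_timeout_s)
    if interval is None:
        # No keepalives at all: the worker is silent for the whole delay.
        longest_silence = min_apply_delay_s
    else:
        # The worker is silent until the next keepalive or the end of the
        # delay, whichever comes first.
        longest_silence = min(interval, min_apply_delay_s)

    return longest_silence > wal_sender_timeout_s


# The case from the quoted mail: 2min sender timeout, 5min receiver timeout,
# 10min delay. The first keepalive goes out after 2.5min, which is too late.
print(walsender_times_out(120, 300, 600))
# Matching timeouts on both sides: keepalives go out every 1min.
print(walsender_times_out(120, 120, 600))
```

In this model the mismatch alone is enough to cause the timeout, regardless of
when min_apply_delay was set, which is why the interval would have to be
derived from (or checked against) the publisher's wal_sender_timeout as well.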

--
With Regards,
Amit Kapila.
