RE: Time delayed LR (WAS Re: logical replication restrictions)

From: "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>
To: 'Kyotaro Horiguchi' <horikyota(dot)ntt(at)gmail(dot)com>
Cc: "smithpb2250(at)gmail(dot)com" <smithpb2250(at)gmail(dot)com>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>, "m(dot)melihmutlu(at)gmail(dot)com" <m(dot)melihmutlu(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "marcos(at)f10(dot)com(dot)br" <marcos(at)f10(dot)com(dot)br>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Time delayed LR (WAS Re: logical replication restrictions)
Date: 2022-12-12 07:42:30
Message-ID: TYCPR01MB837344924D4D239FCF840BBAEDE29@TYCPR01MB8373.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Monday, December 12, 2022 2:54 PM Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> I asked about unexpected walsender termination caused by this feature, but I
> think I didn't receive an answer to it, and the behavior still exists.
>
> Specifically, if servers have the following settings, walsender terminates due to
> replication timeout. After that, the connection is restored after the LR delay elapses.
> Although it can be said to be working in that sense, the error happens
> repeatedly at roughly min_apply_delay intervals and is hard to distinguish
> from network troubles. I'm not sure whether you're deliberately okay with it, but I don't
> think behavior that causes replication timeouts is acceptable.
>
> > wal_sender_timeout = 10s;
> > wal_receiver_timeout = 10s;
> >
> > create subscription ... with (min_apply_delay='60s');
>
> This is somewhat artificial, but timeout=60s and delay=5m is not an uncommon
> setup, and that also causes this "issue".
>
> subscriber:
> > 2022-12-12 14:17:18.139 JST LOG: terminating walsender process due to
> > replication timeout
> > 2022-12-12 14:18:11.076 JST LOG: starting logical decoding for slot "s1"
> ...
Hi, Horiguchi-san

Thank you so much for your report!
Yes. How to deal with the timeout issue is currently under discussion.
Some analysis of the root cause is also there.

Kindly have a look at [1].

[1] - https://www.postgresql.org/message-id/TYAPR01MB58669394A67F2340B82E42D1F5E29%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best Regards,
Takamichi Osumi
