Re: Infinite loop in XLogPageRead() on standby

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: michael(at)paquier(dot)xyz
Cc: cyberdemn(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, thomas(dot)munro(at)gmail(dot)com
Subject: Re: Infinite loop in XLogPageRead() on standby
Date: 2024-03-01 03:37:55
Message-ID: 20240301.123755.1228356975106538264.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Fri, 01 Mar 2024 12:04:31 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> At Fri, 01 Mar 2024 10:29:12 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> > After reading this, I came up with a possibility that walreceiver
> > recovers more quickly than the calling interval to
> > WaitForWALtoBecomeAvailable(). If walreceiver disconnects after a call
> > to the function WaitForWAL...(), and then somehow recovers the
> > connection before the next call, the function doesn't notice the
> > disconnection and returns XLREAD_SUCCESS wrongly. If this assumption
> > is correct, the correct fix might be for us to return XLREAD_FAIL when
> > reconnection happens after the last call to the WaitForWAL...()
> > function.
>
> That's my stupid. The function performs reconnection by itself.

Anyway, our current policy here is to avoid record-rereads beyond
source switches. However, fixing this seems to require that source
switches cause record rereads unless some additional information is
available to know if replay LSN needs to back up.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhijie Hou (Fujitsu) 2024-03-01 03:42:23 RE: Synchronizing slots from primary to standby
Previous Message shveta malik 2024-03-01 03:27:56 Re: Synchronizing slots from primary to standby