From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Subject: | Re: Synchronizing slots from primary to standby |
Date: | 2024-01-22 06:58:30 |
Message-ID: | CAA4eK1Jhy1-bsu6vc0=Nja7aw5-EK_=101pnnuM3ATqTA8+=Sg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jan 19, 2024 at 3:55 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Fri, Jan 19, 2024 at 10:35 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> >
> > Thank you for updating the patch. I have some comments:
> >
> > ---
> > + latestWalEnd = GetWalRcvLatestWalEnd();
> > + if (remote_slot->confirmed_lsn > latestWalEnd)
> > + {
> > + elog(ERROR, "exiting from slot synchronization as the
> > received slot sync"
> > + " LSN %X/%X for slot \"%s\" is ahead of the
> > standby position %X/%X",
> > + LSN_FORMAT_ARGS(remote_slot->confirmed_lsn),
> > + remote_slot->name,
> > + LSN_FORMAT_ARGS(latestWalEnd));
> > + }
> >
> > IIUC GetWalRcvLatestWalEnd () returns walrcv->latestWalEnd, which is
> > typically the primary server's flush position and doesn't mean the LSN
> > where the walreceiver received/flushed up to.
>
> yes. I think it makes more sense to use something which actually tells
> flushed-position. I gave it a try by replacing GetWalRcvLatestWalEnd()
> with GetWalRcvFlushRecPtr() but I see a problem here. Lets say I have
> enabled the slot-sync feature in a running standby, in that case we
> are all good (flushedUpto is the same as actual flush-position
> indicated by LogstreamResult.Flush). But if I restart standby, then I
> observed that the startup process sets flushedUpto to some value 'x'
> (see [1]) while when the wal-receiver starts, it sets
> 'LogstreamResult.Flush' to another value (see [2]) which is always
> greater than 'x'. And we do not update flushedUpto with the
> 'LogstreamResult.Flush' value in walreceiver until we actually do an
> operation on primary. Performing a data change on primary sends WALs
> to standby which then hits XLogWalRcvFlush() and updates flushedUpto
> same as LogstreamResult.Flush. Until then we have a situation where
> slots received on standby are ahead of flushedUpto and thus slotsync
> worker keeps one erroring out. I am yet to find out why flushedUpto is
> set to a lower value than 'LogstreamResult.Flush' at the start of
> standby. Or maybe am I using the wrong function
> GetWalRcvFlushRecPtr() and should be using something else instead?
>
Can we think of using GetStandbyFlushRecPtr()? We probably need to
expose this function, if this works for the required purpose.
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2024-01-22 07:00:08 | Re: [PoC] Improve dead tuple storage for lazy vacuum |
Previous Message | Peter Eisentraut | 2024-01-22 06:55:04 | Re: speed up a logical replica setup |