Re: Synchronizing slots from primary to standby

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-12-01 09:33:33
Message-ID: CAJpy0uCohHzphuvY-yadORvK0cjJT4vXbR=Ti+ea-Xh0DBPi4Q@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 1, 2023 at 12:47 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Fri, Dec 1, 2023 at 11:17 AM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Friday, December 1, 2023 12:51 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > >
> > > On Fri, Dec 1, 2023 at 9:40 AM Zhijie Hou (Fujitsu)
> > > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > > >
> > > > On Wednesday, November 29, 2023 5:12 PM Zhijie Hou (Fujitsu)
> > > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > > >
> > > > I was reviewing the slotsync worker design and here
> > > > are a few comments on the 0002 patch:
> > >
> > > Thanks for reviewing the patch.
> > >
> > > >
> > > >
> > > > 3. In synchronize_one_slot, do we need to skip the slot sync and drop if the
> > > > local slot is a physical one ?
> > > >
> > >
> > > IMO, if a local slot exists which is a physical one, it will be a user
> > > created slot and in that case worker will error out on finding
> > > existing slot with same name. And the case where local slot is
> > > physical one but not user-created is not possible on standby (assuming
> > > we have correct check on primary disallowing setting 'failover'
> > > property for physical slot). Do you have some other scenario in mind,
> > > which I am missing here?
> >
> > I was thinking about the race condition where the worker has confirmed that
> > the slot is not a user-created one and entered the
> > "sync_state == SYNCSLOT_STATE_READY" branch, but at that moment someone uses
> > "DROP_REPLICATION_SLOT" to drop this slot and recreate another one (e.g. a
> > physical one). The slotsync worker will then overwrite the fields of this
> > physical slot. This affects user-created logical slots in similar cases as well.
> >
>
> Users cannot drop the synced slots on standby; an attempt should result
> in an ERROR. Currently we emit this error in pg_drop_replication_slot();
> the same is needed in the "DROP_REPLICATION_SLOT" replication cmd. I will
> change it. Thanks for raising this point. I think, after this ERROR,
> there is no need to worry about physical slots handling in
> synchronize_one_slot().
>
> > And the same is true for slotsync_drop_initiated_slots() and
> > drop_obsolete_slots(): as we don't lock the slots in the list, if a user
> > tries to drop and re-create the old slot concurrently, then we could drop
> > a user-created slot here.
> >
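
To make the intended check concrete, here is a minimal standalone sketch of
rejecting user-issued drops of synced slots on a standby from both entry
points. It is not the patch code: the Toy* types, the toy_* functions and the
TOY_SYNCSLOT_STATE_* values are hypothetical stand-ins, and the rule "a slot
is synced if its state is anything other than NONE" is an assumption made for
illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a slot's sync state; not the patch's definition. */
typedef enum ToySyncState
{
    TOY_SYNCSLOT_STATE_NONE,        /* user-created slot, never synced */
    TOY_SYNCSLOT_STATE_INITIATED,   /* sync started, slot not yet usable */
    TOY_SYNCSLOT_STATE_READY        /* fully synced from the primary */
} ToySyncState;

/* Hypothetical, heavily reduced model of a replication slot. */
typedef struct ToySlot
{
    const char  *name;
    ToySyncState sync_state;
} ToySlot;

/*
 * Shared check for every user-facing drop path.  Returns false (where the
 * real code would raise an ERROR) if the server is a standby and the slot
 * is being synced from the primary.
 */
bool
toy_user_drop_allowed(const ToySlot *slot, bool in_recovery)
{
    if (in_recovery && slot->sync_state != TOY_SYNCSLOT_STATE_NONE)
        return false;
    return true;
}

/* Models the SQL-level drop, i.e. pg_drop_replication_slot(). */
bool
toy_sql_drop(const ToySlot *slot, bool in_recovery)
{
    return toy_user_drop_allowed(slot, in_recovery);
}

/* Models the DROP_REPLICATION_SLOT replication command. */
bool
toy_replcmd_drop(const ToySlot *slot, bool in_recovery)
{
    return toy_user_drop_allowed(slot, in_recovery);
}
```

Routing both paths through one check is only a design suggestion in this
sketch; the point is that neither path can let a user drop and re-create a
slot that the slotsync worker still considers synced.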

PFA v42. Changes:

v42-0001: addressed comments in [1]. Thanks Hou-San for working on this.

v42-0002: addressed comments in [2] and [3]

[1]: https://www.postgresql.org/message-id/CAHut%2BPsMTvrwUBtcHff0CG_j-ALSuEta8xC1R_k0kjR%2B9A6ehg%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAFPTHDb8LW4i9-nyvz%2BXVkJmmciZwYGivpH%3DaDOrDkBfHR_q9w%40mail.gmail.com
[3]: https://www.postgresql.org/message-id/OS0PR01MB571678BABEDBE830062CAB119481A%40OS0PR01MB5716.jpnprd01.prod.outlook.com

thanks
Shveta

Attachment Content-Type Size
v42-0003-Allow-slot-sync-worker-to-wait-for-the-cascading.patch application/octet-stream 7.8 KB
v42-0002-Add-logical-slot-sync-capability-to-the-physical.patch application/octet-stream 80.2 KB
v42-0001-Allow-logical-walsenders-to-wait-for-the-physica.patch application/octet-stream 136.6 KB
