Re: Synchronizing slots from primary to standby

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-10-27 10:34:16
Message-ID: CAJpy0uDv01ctC3z7fV3cbvVw8o+micn7zkD+VBxFm4TnsQh3OQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 27, 2023 at 3:26 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Wed, Oct 25, 2023 at 3:15 PM Drouvot, Bertrand
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > On 10/25/23 5:00 AM, shveta malik wrote:
> > > On Tue, Oct 24, 2023 at 11:54 AM Drouvot, Bertrand
> > > <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> > >>
> > >> Hi,
> > >>
> > >> On 10/23/23 2:56 PM, shveta malik wrote:
> > >>> On Mon, Oct 23, 2023 at 5:52 PM Drouvot, Bertrand
> > >>> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> > >>
> > >>>> We are waiting for DEFAULT_NAPTIME_PER_CYCLE (3 minutes) before checking if there
> > >>>> is new synced slot(s) to be created on the standby. Do we want to keep this behavior
> > >>>> for V1?
> > >>>>
> > >>>
> > >>> I think for the slotsync workers case, we should reduce the naptime in
> > >>> the launcher to say 30sec and retain the default one of 3mins for
> > >>> subscription apply workers. Thoughts?
> > >>>
> > >>
> > >> Another option could be to keep DEFAULT_NAPTIME_PER_CYCLE and create a new
> > >> API on the standby that would refresh the list of sync slot at wish, thoughts?
> > >>
> > >
> > > Do you mean API to refresh list of DBIDs rather than sync-slots?
> > > As per current design, launcher gets DBID lists for all the failover
> > > slots from the primary at intervals of DEFAULT_NAPTIME_PER_CYCLE.
> >
> > I mean an API to get a newly created slot on the primary being created/synced on
> > the standby at wish.
> >
> > Also let's imagine this scenario:
> >
> > - create logical_slot1 on the primary (and don't start using it)
> >
> > Then on the standby we'll get things like:
> >
> > 2023-10-25 08:33:36.897 UTC [740298] LOG: waiting for remote slot "logical_slot1" LSN (0/C00316A0) and catalog xmin (752) to pass local slot LSN (0/C0049530) and and catalog xmin (754)
> >
> > That's expected and due to the fact that ReplicationSlotReserveWal() does set the slot
> > restart_lsn to a value < at the corresponding restart_lsn slot on the primary.
> >
> > - create logical_slot2 on the primary (and start using it)
> >
> > Then logical_slot2 won't be created/synced on the standby until there is activity on logical_slot1 on the primary
> > that would produce things like:
> > 2023-10-25 08:41:35.508 UTC [740298] LOG: wait over for remote slot "logical_slot1" as its LSN (0/C005FFD8) and catalog xmin (756) has now passed local slot LSN (0/C0049530) and catalog xmin (754)
>
>
> Slight correction to above. As soon as we start activity on
> logical_slot2, it will impact all the slots on primary, as the WALs
> are consumed by all the slots. So even if there is activity on
> logical_slot2, logical_slot1 creation on standby will be unblocked and
> it will then move to logical_slot2 creation. eg:
>
> --on standby:
> 2023-10-27 15:15:46.069 IST [696884] LOG: waiting for remote slot
> "mysubnew1_1" LSN (0/3C97970) and catalog xmin (756) to pass local
> slot LSN (0/3C979A8) and and catalog xmin (756)
>
> on primary:
> newdb1=# select now();
> now
> ----------------------------------
> 2023-10-27 15:15:51.504835+05:30
> (1 row)
>
> --activity on mysubnew1_3
> newdb1=# insert into tab1_3 values(1);
> INSERT 0 1
> newdb1=# select now();
> now
> ----------------------------------
> 2023-10-27 15:15:54.651406+05:30
>
>
> --on standby, mysubnew1_1 is unblocked.
> 2023-10-27 15:15:56.223 IST [696884] LOG: wait over for remote slot
> "mysubnew1_1" as its LSN (0/3C97A18) and catalog xmin (757) has now
> passed local slot LSN (0/3C979A8) and catalog xmin (756)
>
> My Setup:
> mysubnew1_1 -->mypubnew1_1 -->tab1_1
> mysubnew1_3 -->mypubnew1_3-->tab1_3
>
> thanks
> Shveta

PFA v26 patches. The changes are:

1) 'Failover' in the main slot is now set once the table
synchronization phase finishes. So even when failover is enabled
for a subscription, the internal failover state remains temporarily
'pending' until the initialization phase completes.
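For illustration, a hedged sketch of how this could be observed. The `failover` subscription option and the `subfailoverstate` column follow this patch series (modeled on `subtwophasestate`); names and states may differ in any committed version:

```sql
-- Create a subscription with failover enabled (syntax per this patch set).
CREATE SUBSCRIPTION mysubnew1_1
    CONNECTION 'host=primary dbname=newdb1'
    PUBLICATION mypubnew1_1
    WITH (failover = true);

-- While table synchronization is still running, the internal failover
-- state is expected to show 'p' (pending); it is set on the main slot
-- only after the initialization phase completes.
SELECT subname, subfailoverstate FROM pg_subscription;
```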

2) If the standby is down but standby_slot_names includes its slot
name, we now emit a warning while waiting for that standby.
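As a sketch, the wait in question arises once the patch's `standby_slot_names` GUC is set on the primary, e.g. as below (GUC name as in this patch series; the slot name is illustrative):

```sql
-- On the primary: make logical walsenders (and the SQL functions) wait
-- for the named physical slot(s) before sending changes to
-- failover-enabled subscribers.
ALTER SYSTEM SET standby_slot_names = 'sb1_slot';
SELECT pg_reload_conf();
```

With this change, if the standby owning `sb1_slot` is down, the wait is accompanied by a warning rather than being silent.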

3) Fixed a bug where pg_logical_slot_get_changes() was resetting the
failover property of the slot. Thanks to Ajin for providing the fix.

4) Fixed a bug where standby_slot_names_list was not initialized in
non-walsender cases, causing pg_logical_slot_get_changes() to proceed
without waiting for standbys.
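For reference, the non-walsender path in question is exercised via the SQL interface, e.g. (slot name illustrative):

```sql
-- Consuming changes through the SQL function runs in a regular backend,
-- not a walsender; with the fix, this call also waits for the standbys
-- named in standby_slot_names when the slot has failover enabled.
SELECT * FROM pg_logical_slot_get_changes('logical_slot1', NULL, NULL);
```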

5) Fixed a bug where standby_slot_names_list was freed (when the
per-query context was freed in non-walsender cases) but not set to
NULL, so the next call used the dangling pointer and crashed.

6) Improved wait_for_primary_slot_catchup(): we now also fetch the
remote slot's conflicting (invalidation) status and abort both the
wait and the slot creation if the slot on the primary is invalidated.
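A sketch of the kind of status the worker now fetches from the primary. The `conflicting` and `wal_status` columns exist in `pg_replication_slots`; exactly which fields this patch consults is per the patch, and the slot name is illustrative:

```sql
-- On the primary: an invalidated slot reports conflicting = true or
-- wal_status = 'lost'; in that case the slot-sync worker aborts the
-- wait and the local slot creation instead of waiting forever.
SELECT slot_name, conflicting, wal_status
FROM pg_replication_slots
WHERE slot_name = 'logical_slot1';
```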

7) Slot-sync workers now wait for the cascading standby's confirmation
before updating logical synced slots on the first standby.

The first five changes are in patch 0001 and the sixth is in patch
0002. For the seventh, I have created a new patch (0003) to separate
out the additional changes needed for cascading standbys.

==========

Open questions regarding the change in pt 1 above:
a) I think we should restrict 'ALTER SUBSCRIPTION ... SET (failover)'
while the failover state is still 'p' (pending), i.e. while table-sync
is still in progress. Once table-sync is over, toggling 'failover'
via ALTER SUBSCRIPTION should be allowed.
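Concretely, the restriction proposed in (a) would reject a toggle like the following while table-sync is in progress (statement syntax per this patch set; the rejection is the proposal, not current behavior):

```sql
-- Proposed to fail while the subscription's failover state is still
-- 'p' (pending), i.e. table synchronization has not yet finished:
ALTER SUBSCRIPTION mysubnew1_1 SET (failover = true);
```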

b) Currently I have restricted 'ALTER SUBSCRIPTION ... REFRESH
PUBLICATION WITH (copy_data = true)' when failover=true (along the
same lines as two_phase). The reason: a refresh with copy_data=true
triggers table-sync again, and since failover is set on the main slot
only after table-sync completes, the main slot would have to go
through the same 'p' to 'e' transition, making it unsyncable during
that time. Should it be allowed?
Currently:
newdb1=# ALTER SUBSCRIPTION mysubnew1_1 REFRESH PUBLICATION WITH
(copy_data=true);
ERROR: ALTER SUBSCRIPTION ... REFRESH with copy_data is not allowed
when failover is enabled
HINT: Use ALTER SUBSCRIPTION ... REFRESH with copy_data = false, or
use DROP/CREATE SUBSCRIPTION.

Thoughts on above queries?

thanks
Shveta

Attachment Content-Type Size
v26-0003-Allow-slot-sync-workers-to-wait-for-the-cascadin.patch application/octet-stream 12.5 KB
v26-0001-Allow-logical-walsenders-to-wait-for-the-physica.patch application/octet-stream 117.8 KB
v26-0002-Add-logical-slot-sync-capability-to-physical-sta.patch application/octet-stream 109.7 KB
