Re: Synchronizing slots from primary to standby

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-11-16 12:03:54
Message-ID: CAJpy0uB_y_1akXUWM9WnSdQ-NVWTnHTpmNu6gJQ5tnyuAMTUOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 16, 2023 at 3:43 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Nov 15, 2023 at 5:21 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > PFA v34.
> >
>
> Few comments on v34-0001*
> =======================
> 1.
> + char buf[100];
> +
> + buf[0] = '\0';
> +
> + if (MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_PENDING)
> + strcat(buf, "twophase");
> + if (MySubscription->failoverstate == LOGICALREP_FAILOVER_STATE_PENDING)
> + {
> + if (buf[0] != '\0')
> + strcat(buf, " and ");
> + strcat(buf, "failover");
> + }
> +
> ereport(LOG,
> - (errmsg("logical replication apply worker for subscription \"%s\"
> will restart so that two_phase can be enabled",
> - MySubscription->name)));
> + (errmsg("logical replication apply worker for subscription \"%s\"
> will restart so that %s can be enabled",
> + MySubscription->name, buf)));
>
> I feel it is better to separate elogs rather than construct the
> string. It would be easier for the translation.
>
> 2.
> -
> /* Initialize walsender process before entering the main command loop */
>
> Spurious line removal
>
> 3.
> @@ -440,17 +448,8 @@ pg_physical_replication_slot_advance(XLogRecPtr moveto)
>
> if (startlsn < moveto)
> {
> - SpinLockAcquire(&MyReplicationSlot->mutex);
> - MyReplicationSlot->data.restart_lsn = moveto;
> - SpinLockRelease(&MyReplicationSlot->mutex);
> + PhysicalConfirmReceivedLocation(moveto);
> retlsn = moveto;
> -
> - /*
> - * Dirty the slot so as it is written out at the next checkpoint. Note
> - * that the LSN position advanced may still be lost in the event of a
> - * crash, but this makes the data consistent after a clean shutdown.
> - */
> - ReplicationSlotMarkDirty();
> }
>
> I think this change has been made so that we can wakeup logical
> walsenders from a central location. In general, this is a good idea
> but it seems calling PhysicalConfirmReceivedLocation() would make an
> additional call to ReplicationSlotsComputeRequiredLSN() which is
> already called in the caller of
> pg_physical_replication_slot_advance(), so not sure such unification
> is a good idea here.
>
> 4.
> + * Here logical walsender associated with failover logical slot waits
> + * for physical standbys corresponding to physical slots specified in
> + * standby_slot_names GUC.
> + */
> +void
> +WalSndWaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
>
> In the above comments, we don't seem to follow the 80-col limit.
> Please check all other comments in the patch for similar problem.
>
> 5.
> +static void
> +WalSndRereadConfigAndSlots(List **standby_slots)
> +{
> + char *pre_standby_slot_names = pstrdup(standby_slot_names);
> +
> + ProcessConfigFile(PGC_SIGHUP);
> +
> + if (strcmp(pre_standby_slot_names, standby_slot_names) != 0)
> + {
> + list_free(*standby_slots);
> + *standby_slots = GetStandbySlotList(true);
> + }
> +
> + pfree(pre_standby_slot_names);
> +}
>
> The function name is misleading w.r.t the functionality. Can we name
> it on the lines of WalSndRereadConfigAndReInitSlotList()? I know it is
> a bit longer but couldn't come up with anything better.
>
> 6.
> + /*
> + * Fast path to entering the loop in case we already know we have
> + * enough WAL available and all the standby servers has confirmed
> + * receipt of WAL upto RecentFlushPtr.
>
> I think this comment is a bit misleading because it is a fast path to
> avoid entering the loop. I think we can keep the existing comment
> here: "Fast path to avoid acquiring the spinlock in case we already
> know ..."
>
> 7.
> @@ -3381,7 +3673,9 @@ WalSndWait(uint32 socket_events, long timeout,
> uint32 wait_event)
> * And, we use separate shared memory CVs for physical and logical
> * walsenders for selective wake ups, see WalSndWakeup() for more details.
> */
> - if (MyWalSnd->kind == REPLICATION_KIND_PHYSICAL)
> + if (wait_for_standby)
> + ConditionVariablePrepareToSleep(&WalSndCtl->wal_confirm_rcv_cv);
> + else if (MyWalSnd->kind == REPLICATION_KIND_PHYSICAL)
>
> The comment above this change needs to be updated for the usage of this new CV.
>
> 8.
> +WAL_SENDER_WAIT_FOR_STANDBY_CONFIRMATION "Waiting for physical
> standby confirmation in WAL sender process."
>
> I feel the above description is not clear. How about being more
> specific with something along the lines of: "Waiting for the WAL to be
> received by physical standby in WAL sender process."
>
> 9.
> + {"standby_slot_names", PGC_SIGHUP, REPLICATION_PRIMARY,
> + gettext_noop("List of streaming replication standby server slot "
> + "names that logical walsenders waits for."),
>
> I think we slightly simplify it by saying: "Lists streaming
> replication standby server slot names that logical WAL sender
> processes wait for.". It would be more consistent with a few other
> similar variables.
>
> 10.
> + gettext_noop("List of streaming replication standby server slot "
> + "names that logical walsenders waits for."),
> + gettext_noop("Decoded changes are sent out to plugins by logical "
> + "walsenders only after specified replication slots "
> + "confirm receiving WAL."),
>
> Instead of walsenders, let's use WAL sender processes.
>
> 11.
> @@ -6622,10 +6623,12 @@ describeSubscriptions(const char *pattern, bool verbose)
> appendPQExpBuffer(&buf,
> ", suborigin AS \"%s\"\n"
> ", subpasswordrequired AS \"%s\"\n"
> - ", subrunasowner AS \"%s\"\n",
> + ", subrunasowner AS \"%s\"\n"
> + ", subfailoverstate AS \"%s\"\n",
> gettext_noop("Origin"),
> gettext_noop("Password required"),
> - gettext_noop("Run as owner?"));
> + gettext_noop("Run as owner?"),
> + gettext_noop("Enable failover?"));
>
> Let's name the new column as "Failover" and also it should be
> displayed only when pset.sversion is >=17.
>
> 12.
> @@ -93,6 +97,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId)
> BKI_SHARED_RELATION BKI_ROW
> bool subrunasowner; /* True if replication should execute as the
> * subscription owner */
>
> + char subfailoverstate; /* Enable Failover State */
>
> This should be listed in system_views.sql in the below GRANT statement:
> GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
> subbinary, substream, subtwophasestate, subdisableonerr,
> subpasswordrequired, subrunasowner,
> subslotname, subsynccommit, subpublications, suborigin)
>
> 13.
> + ConditionVariable wal_confirm_rcv_cv;
> +
> WalSnd walsnds[FLEXIBLE_ARRAY_MEMBER];
> } WalSndCtlData;
>
> It is better to add a comment for this new variable explaining its use.
>
> --
> With Regards,
> Amit Kapila.

PFA v35. It has below changes:

1) change of default for 'enable_syncslot' to false.
2) validate the dbname provided in primary_conninfo before attempting
slot-sync.
3) do not allow logical decoding on slots with 'i' sync_state.
4) support in pg_upgrade for the failover property of slot.
5) do not start slot-sync if wal_level < logical
6) shutdown the slotsync worker on promotion.

Thanks Ajin for working on 4 and 5. Thanks Hou-San for working on 6.

The changes are in patch001 and patch002.

With above changes, comments in [1] and [2] are addressed

TODO:
1) Comments in [3].
2) Analyze if we need to consider supporting an upgrade of the slot's
'sync_state' property?

[1]: https://www.postgresql.org/message-id/OS0PR01MB571652CCD42F1D08D5BD69D494B3A%40OS0PR01MB5716.jpnprd01.prod.outlook.com
[2]: https://www.postgresql.org/message-id/46070646-9e09-4566-8a62-ae31a12a510c%40gmail.com
[3]: https://www.postgresql.org/message-id/CAA4eK1J%3D-kPHS1eHNBtzOQHZ64j6WSgSYQZ3fH%3D2vfiwy_48AA%40mail.gmail.com

thanks
Shveta

Attachment Content-Type Size
v35-0003-Allow-slot-sync-worker-to-wait-for-the-cascading.patch application/octet-stream 7.9 KB
v35-0001-Allow-logical-walsenders-to-wait-for-the-physica.patch application/octet-stream 130.1 KB
v35-0002-Add-logical-slot-sync-capability-to-the-physical.patch application/octet-stream 102.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2023-11-16 12:38:52 Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)
Previous Message Ashutosh Bapat 2023-11-16 11:10:48 Re: serial and partitioned table