RE: Synchronizing slots from primary to standby

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: RE: Synchronizing slots from primary to standby
Date: 2024-03-02 03:51:18
Message-ID: OS0PR01MB5716B48653125E199A0901D3945D2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Friday, March 1, 2024 12:23 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are some review comments for v102-0001.
>
> ======
> doc/src/sgml/config.sgml
>
> 1.
> + <para>
> + Lists the streaming replication standby server slot names that logical
> + WAL sender processes will wait for. Logical WAL sender processes will
> + send decoded changes to plugins only after the specified replication
> + slots confirm receiving WAL. This guarantees that logical replication
> + slots with failover enabled do not consume changes until those
> changes
> + are received and flushed to corresponding physical standbys. If a
> + logical replication connection is meant to switch to a physical standby
> + after the standby is promoted, the physical replication slot for the
> + standby should be listed here. Note that logical replication will not
> + proceed if the slots specified in the standby_slot_names do
> not exist or
> + are invalidated.
> + </para>
>
> Should this also mention the effect this GUC has on those 2 SQL functions? E.g.
> Commit message says:
>
> Additionally, The SQL functions pg_logical_slot_get_changes and
> pg_replication_slot_advance are modified to wait for the replication slots
> mentioned in 'standby_slot_names' to catch up before returning.

Added.

>
> ======
> src/backend/replication/slot.c
>
> 2. validate_standby_slots
>
> + else if (!ReplicationSlotCtl)
> + {
> + /*
> + * We cannot validate the replication slot if the replication slots'
> + * data has not been initialized. This is ok as we will validate the
> + * specified slot when waiting for them to catch up. See
> + * StandbySlotsHaveCaughtup for details.
> + */
> + }
> + else
> + {
> + /*
> + * If the replication slots' data have been initialized, verify if the
> + * specified slots exist and are logical slots.
> + */
> + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
>
> IMO that 2nd comment does not need to say "If the replication slots'
> data have been initialized," because that is implicit from the if/else.

Removed.

>
> ~~~
>
> 3. GetStandbySlotList
>
> +List *
> +GetStandbySlotList(void)
> +{
> + if (RecoveryInProgress())
> + return NIL;
> + else
> + return standby_slot_names_list;
> +}
>
> The 'else' is not needed. IMO is better without it, but YMMV.

Removed.

>
> ~~~
>
> 4. StandbySlotsHaveCaughtup
>
> +/*
> + * Return true if the slots specified in standby_slot_names have caught
> +up to
> + * the given WAL location, false otherwise.
> + *
> + * The elevel parameter determines the error level used for logging
> +messages
> + * related to slots that do not exist, are invalidated, or are inactive.
> + */
> +bool
> +StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel)
>
> /determines/specifies/

Changed.

>
> ~
>
> 5.
> + /*
> + * Don't need to wait for the standby to catch up if there is no value
> + in
> + * standby_slot_names.
> + */
> + if (!standby_slot_names_list)
> + return true;
> +
> + /*
> + * If we are on a standby server, we do not need to wait for the
> + standby to
> + * catch up since we do not support syncing slots to cascading standbys.
> + */
> + if (RecoveryInProgress())
> + return true;
> +
> + /*
> + * Return true if all the standbys have already caught up to the passed
> + in
> + * WAL localtion.
> + */
> + if (!XLogRecPtrIsInvalid(standby_slot_oldest_flush_lsn) &&
> + standby_slot_oldest_flush_lsn >= wait_for_lsn) return true;
>
>
> 5a.
> I felt all these comments should be worded in a consistent way like "Don't need
> to wait ..."
>
> e.g.
> 1. Don't need to wait for the standbys to catch up if there is no value in
> 'standby_slot_names'.
> 2. Don't need to wait for the standbys to catch up if we are on a standby server,
> since we do not support syncing slots to cascading standbys.
> 3. Don't need to wait for the standbys to catch up if they are already beyond
> the specified WAL location.

Changed.

>
> ~
>
> 5b.
> typo
> /WAL localtion/WAL location/
>

Fixed.

> ~~~
>
> 6.
> + if (!slot)
> + {
> + /*
> + * It may happen that the slot specified in standby_slot_names GUC
> + * value is dropped, so let's skip over it.
> + */
> + ereport(elevel,
> + errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("replication slot \"%s\" specified in parameter %s does not exist",
> + name, "standby_slot_names"));
> + continue;
> + }
>
> Is "is dropped" strictly the only reason? IIUC another reason is the slot maybe
> just did not even exist in the first place but it was not detected before now
> because inititial validation was skipped.

Changed the comment.

>
> ~~~
>
> 7.
> + /*
> + * Return false if not all the standbys have caught up to the specified
> + WAL
> + * location.
> + */
> + if (caught_up_slot_num != list_length(standby_slot_names_list))
> + return false;
>
> Somehow it seems complicated to have a counter of the slots as you process
> then compare that counter to the list_length to determine if one of them was
> skipped.
>
> Probably simpler just to set a 'skipped' flag set whenever you do 'continue'...
>

I prefer the current style because we need to set skipped =true in
multiple places which doesn't seem better to me.

> ======
> src/backend/replication/walsender.c
>
> 8.
> +/*
> + * Returns true if there are not enough WALs to be read, or if not all
> +standbys
> + * have caught up to the flushed position when failover_slot is true;
> + * otherwise, returns false.
> + *
> + * Set prioritize_stop to true to skip waiting for WALs if the shutdown
> +signal
> + * is received.
> + *
> + * Set failover_slot to true if the current acquired slot is a failover
> +enabled
> + * slot and we are streaming.
> + *
> + * If returning true, the function sets the appropriate wait event in
> + * wait_event; otherwise, wait_event is set to 0.
> + */
> +static bool
> +NeedToWaitForWal(XLogRecPtr target_lsn, XLogRecPtr flushed_lsn, bool
> +prioritize_stop, bool failover_slot,
> + uint32 *wait_event)
>
> 8a.
> /Set prioritize_stop to true/Specify prioritize_stop=true/
>
> /Set failover_slot to true/Specify failover_slot=true/

This function has been refactored a bit.

>
> ~
>
> 8b.
> Aren't the static function names typically snake_case?

I think the current name style is more consistent with the other functions in walsender.c.

>
> ~~~
>
> 9.
> + /*
> + * Check if we need to wait for WALs to be flushed to disk. We don't
> + need
> + * to wait for WALs after receiving the shutdown signal unless the
> + * wait_for_wal_on_stop is true.
> + */
> + if (target_lsn > flushed_lsn && !(prioritize_stop && got_STOPPING))
> + *wait_event = WAIT_EVENT_WAL_SENDER_WAIT_FOR_WAL;
>
> The comment says 'wait_for_wal_on_stop' but the code says 'prioritize_stop'
> (??)

This has been removed in last version.

> ~~~
>
> 10.
> + /*
> + * Check if we need to wait for WALs to be flushed to disk. We don't
> + need
> + * to wait for WALs after receiving the shutdown signal unless the
> + * wait_for_wal_on_stop is true.
> + */
> + if (target_lsn > flushed_lsn && !(prioritize_stop && got_STOPPING))
> + *wait_event = WAIT_EVENT_WAL_SENDER_WAIT_FOR_WAL;
> +
> + /*
> + * Check if the standby slots have caught up to the flushed position.
> + It is
> + * good to wait up to RecentFlushPtr and then let it send the changes
> + to
> + * logical subscribers one by one which are already covered in
> + * RecentFlushPtr without needing to wait on every change for standby
> + * confirmation. Note that after receiving the shutdown signal, an
> + ERROR is
> + * reported if any slots are dropped, invalidated, or inactive. This
> + * measure is taken to prevent the walsender from waiting indefinitely.
> + */
> + else if (failover_slot && !StandbySlotsHaveCaughtup(flushed_lsn,
> + elevel)) *wait_event = WAIT_EVENT_WAIT_FOR_STANDBY_CONFIRMATION;
> + else
> + return false;
> +
> + return true;
>
> This if/else/else seems overly difficult to read. IMO it will be easier if written
> like:
>
> SUGGESTION
> <comment>
> if (target_lsn > flushed_lsn && !(prioritize_stop && got_STOPPING)) {
> *wait_event = WAIT_EVENT_WAL_SENDER_WAIT_FOR_WAL;
> return true;
> }
>
> <comment>
> if (failover_slot && !StandbySlotsHaveCaughtup(flushed_lsn, elevel)) {
> *wait_event = WAIT_EVENT_WAIT_FOR_STANDBY_CONFIRMATION;
> return true;
> }
>
> return false;

Changed.

Attach the V103 patch set which addressed above comments and
Sawada-san's comment[1].

Apart from the comments, the code in WalSndWaitForWal was refactored
a bit to make it neater. Thanks Shveta for helping writing the code and doc.

[1] https://www.postgresql.org/message-id/CAD21AoBhty79zHgXYMNHH1KqO2VtmjRi22QPmYP2yaHC9WFVdw%40mail.gmail.com

Best Regards,
Hou zj

Attachment Content-Type Size
v103-0002-Document-the-steps-to-check-if-the-standby-is-r.patch application/octet-stream 7.0 KB
v103-0001-Allow-logical-walsenders-to-wait-for-the-physic.patch application/octet-stream 45.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhijie Hou (Fujitsu) 2024-03-02 03:51:51 RE: Synchronizing slots from primary to standby
Previous Message Peter Geoghegan 2024-03-02 01:30:15 Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan