Re: Synchronizing slots from primary to standby

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-02-09 04:26:43
Message-ID: CAHut+Pv88vp9mNxX37c_Bc5FDBsTS+dhV02Vgip9Wqwh7GBYSg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are some review comments for patch v81-0001.

======

1. GENERAL - ReplicationSlotInvalidationCause enum.

I was thinking that the ReplicationSlotInvalidationCause should
explicitly set RS_INVAL_NONE = 0 (it's zero anyway, but making it
explicit with a comment /* Must be zero. */. will stop it from being
changed in the future).

------
/*
* Slots can be invalidated, e.g. due to max_slot_wal_keep_size. If so, the
* 'invalidated' field is set to a value other than _NONE.
*/
typedef enum ReplicationSlotInvalidationCause
{
RS_INVAL_NONE = 0, /* Must be zero. */
...
} ReplicationSlotInvalidationCause;
------

The reason to do this is because many places in the patch check for
RS_INVAL_NONE, but if RS_INVAL_NONE == 0 is assured, all those code
fragments can be simplified and IMO also become more readable.

e.g. update_local_synced_slot()

BEFORE
Assert(slot->data.invalidated == RS_INVAL_NONE);

AFTER
Assert(!slot->data.invalidated);

~

e.g. local_sync_slot_required()

BEFORE
locally_invalidated =
(remote_slot->invalidated == RS_INVAL_NONE) &&
(local_slot->data.invalidated != RS_INVAL_NONE);

AFTER
locally_invalidated = !remote_slot->invalidated && local_slot->data.invalidated;

~

e.g. synchronize_one_slot()

BEFORE
if (slot->data.invalidated == RS_INVAL_NONE &&
remote_slot->invalidated != RS_INVAL_NONE)

AFTER
if (!slot->data.invalidated && remote_slot->invalidated;

BEFORE
/* Skip the sync of an invalidated slot */
if (slot->data.invalidated != RS_INVAL_NONE)

AFTER
/* Skip the sync of an invalidated slot */
if (slot->data.invalidated)

BEFORE
/* Skip creating the local slot if remote_slot is invalidated already */
if (remote_slot->invalidated != RS_INVAL_NONE)

AFTER
/* Skip creating the local slot if remote_slot is invalidated already */
if (remote_slot->invalidated)

~

e.g. synchronize_slots()

BEFORE
if ((XLogRecPtrIsInvalid(remote_slot->restart_lsn) ||
XLogRecPtrIsInvalid(remote_slot->confirmed_lsn) ||
!TransactionIdIsValid(remote_slot->catalog_xmin)) &&
remote_slot->invalidated == RS_INVAL_NONE)

AFTER
if ((XLogRecPtrIsInvalid(remote_slot->restart_lsn) ||
XLogRecPtrIsInvalid(remote_slot->confirmed_lsn) ||
!TransactionIdIsValid(remote_slot->catalog_xmin)) &&
!remote_slot->invalidated)

======
src/backend/replication/logical/slotsync.c

2. update_local_synced_slot

+ if (strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 &&
+ remote_dbid == slot->data.database &&
+ !xmin_changed && !restart_lsn_changed &&
+ remote_slot->two_phase == slot->data.two_phase &&
+ remote_slot->failover == slot->data.failover &&
+ remote_slot->confirmed_lsn == slot->data.confirmed_flush)
+ return false;

Consider rearranging the conditions to put the strcmp later -- e.g.
might as well avoid the (more expensive?) strcmp if some of those
boolean tests are already false.

~~~

3.
+ /*
+ * There is a possibility of parallel database drop by startup
+ * process and re-creation of new slot by user in the small window
+ * between getting the slot to drop and locking the db. This new
+ * user-created slot may end up using the same shared memory as
+ * that of 'local_slot'. Thus check if local_slot is still the
+ * synced one before performing actual drop.
+ */

BEFORE
There is a possibility of parallel database drop by startup process
and re-creation of new slot by user in the small window between
getting the slot to drop and locking the db.

SUGGESTION
In the small window between getting the slot to drop and locking the
database, there is a possibility of a parallel database drop by the
startup process or the creation of a new slot by the user.

~~~

4.
+/*
+ * Synchronize single slot to given position.
+ *
+ * This creates a new slot if there is no existing one and updates the
+ * metadata of the slot as per the data received from the primary server.
+ *
+ * The slot is created as a temporary slot and stays in the same
state until the
+ * the remote_slot catches up with locally reserved position and local slot is
+ * updated. The slot is then persisted and is considered as sync-ready for
+ * periodic syncs.
+ */

/Synchronize single slot to given position./Synchronize a single slot
to the given position./

~~~

5. synchronize_slots

+ /*
+ * The primary_slot_name is not set yet or WALs not received yet.
+ * Synchronization is not possible if the walreceiver is not started.
+ */
+ latestWalEnd = GetWalRcvLatestWalEnd();
+ SpinLockAcquire(&WalRcv->mutex);
+ if ((WalRcv->slotname[0] == '\0') ||
+ XLogRecPtrIsInvalid(latestWalEnd))
+ {
+ SpinLockRelease(&WalRcv->mutex);
+ return;
+ }
+ SpinLockRelease(&WalRcv->mutex);

The comment talks about the GUC "primary_slot_name", but the code is
checking the WalRcv's slotname. It may be the same, but the difference
is confusing.

~~~

6.
+ /*
+ * If restart_lsn, confirmed_lsn or catalog_xmin is invalid but slot
+ * is not invalidated, that means we have fetched the remote_slot in
+ * its RS_EPHEMERAL state itself. In such a case, avoid syncing it
+ * yet. We can always sync it in the next sync cycle when the
+ * remote_slot is persisted and has valid lsn(s) and xmin values.
+ *
+ * XXX: In future, if we plan to expose 'slot->data.persistency' in
+ * pg_replication_slots view, then we can avoid fetching RS_EPHEMERAL
+ * slots in the first place.
+ */

SUGGESTION (1st para)
If restart_lsn, confirmed_lsn or catalog_xmin is invalid but the slot
is valid, that means we have fetched the remote_slot in its
RS_EPHEMERAL state. In such a case, don't sync it; we can always sync
it in the next ...

~~~

7.
+ /*
+ * Use shared lock to prevent a conflict with
+ * ReplicationSlotsDropDBSlots(), trying to drop the same slot during
+ * a drop-database operation.
+ */
+ LockSharedObject(DatabaseRelationId, remote_dbid, 0, AccessShareLock);
+
+ synchronize_one_slot(remote_slot, remote_dbid);
+
+ UnlockSharedObject(DatabaseRelationId, remote_dbid, 0, AccessShareLock);

IMO remove the blank lines (e.g., you don't use this kind of
formatting for spin locks)

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhijie Hou (Fujitsu) 2024-02-09 04:30:04 RE: Synchronizing slots from primary to standby
Previous Message Michael Paquier 2024-02-09 04:21:34 Re: Make COPY format extendable: Extract COPY TO format implementations