Re: Synchronizing slots from primary to standby

From: Ajin Cherian <itsajin(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-03-01 09:09:23
Message-ID: CAFPTHDbERkhu3NoP+rYKwro+epng2eXyQqts+-tOYXi3J-ng=g@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 29, 2024 at 12:34 PM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com> wrote:

> On Monday, February 26, 2024 7:52 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> >
> > On Mon, Feb 26, 2024 at 7:49 AM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > Attach the V98 patch set which addressed above comments.
> > >
> >
> > Few comments:
> > =============
> > 1.
> > WalSndWaitForWal(XLogRecPtr loc)
> > {
> > int wakeEvents;
> > + bool wait_for_standby = false;
> > + uint32 wait_event;
> > + List *standby_slots = NIL;
> > static XLogRecPtr RecentFlushPtr = InvalidXLogRecPtr;
> >
> > + if (MyReplicationSlot->data.failover && replication_active)
> > + standby_slots = GetStandbySlotList(true);
> > +
> > /*
> > - * Fast path to avoid acquiring the spinlock in case we already know we
> > - * have enough WAL available. This is particularly interesting if we're
> > - * far behind.
> > + * Check if all the standby servers have confirmed receipt of WAL up to
> > + * RecentFlushPtr even when we already know we have enough WAL available.
> > + *
> > + * Note that we cannot directly return without checking the status of
> > + * standby servers because the standby_slot_names may have changed, which
> > + * means there could be new standby slots in the list that have not yet
> > + * caught up to the RecentFlushPtr.
> > */
> > - if (RecentFlushPtr != InvalidXLogRecPtr &&
> > - loc <= RecentFlushPtr)
> > - return RecentFlushPtr;
> > + if (!XLogRecPtrIsInvalid(RecentFlushPtr) && loc <= RecentFlushPtr) {
> > + FilterStandbySlots(RecentFlushPtr, &standby_slots);
> >
> > I think even if the slot list is not changed, we will always process
> > each slot mentioned in standby_slot_names once. Can't we cache the
> > previous list of slots we have already waited for? In that case, we
> > won't even need to copy the list via GetStandbySlotList() unless we
> > need to wait.
> >
> > 2.
> > WalSndWaitForWal(XLogRecPtr loc)
> > {
> > + /*
> > + * Update the standby slots that have not yet caught up to the flushed
> > + * position. It is good to wait up to RecentFlushPtr and then let it
> > + * send the changes to logical subscribers one by one which are
> > + * already covered in RecentFlushPtr without needing to wait on every
> > + * change for standby confirmation.
> > + */
> > + if (wait_for_standby)
> > + FilterStandbySlots(RecentFlushPtr, &standby_slots);
> > +
> > /* Update our idea of the currently flushed position. */
> > - if (!RecoveryInProgress())
> > + else if (!RecoveryInProgress())
> > RecentFlushPtr = GetFlushRecPtr(NULL);
> > else
> > RecentFlushPtr = GetXLogReplayRecPtr(NULL); ...
> > /*
> > * If postmaster asked us to stop, don't wait anymore.
> > *
> > * It's important to do this check after the recomputation of
> > * RecentFlushPtr, so we can send all remaining data before shutting
> > * down.
> > */
> > if (got_STOPPING)
> > break;
> >
> > I think because 'wait_for_standby' may not be set in the first or
> > consecutive cycles we may send the WAL to the logical subscriber before
> > sending it to the physical subscriber during shutdown.
>
> Here is the v101 patch set which addressed above comments.
>
> This version will cache the oldest standby slot's LSN each time we waited
> for them to catch up. The cached LSN is invalidated when we reload the GUC
> config. In the WalSndWaitForWal function, instead of traversing the entire
> standby list each time, we can check the cached LSN to quickly determine
> if the standbys have caught up. When a shutdown signal is received, we
> continue to wait for the standby slots to catch up. When waiting for the
> standbys to catch up after receiving the shutdown signal, an ERROR is
> reported if any slots are dropped, invalidated, or inactive. This measure
> is taken to prevent the walsender from waiting indefinitely.
>
>
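Just to confirm my understanding of the cached-LSN fast path described above,
here is a rough sketch of the idea (the variable and function names below are
illustrative only, not necessarily what the patch actually uses):

#include "postgres.h"
#include "access/xlogdefs.h"	/* XLogRecPtr, InvalidXLogRecPtr */

/* Illustrative sketch only -- not the actual v101 code. */
static XLogRecPtr cached_oldest_standby_flush_lsn = InvalidXLogRecPtr;

static bool
StandbysHaveCaughtUp(XLogRecPtr target_lsn)
{
	/*
	 * Fast path: if the cached oldest confirmed-flush LSN of the slots in
	 * standby_slot_names is already past the target location, skip walking
	 * the slot list entirely.  The cache is reset to InvalidXLogRecPtr
	 * whenever the GUC configuration is reloaded.
	 */
	if (!XLogRecPtrIsInvalid(cached_oldest_standby_flush_lsn) &&
		target_lsn <= cached_oldest_standby_flush_lsn)
		return true;

	/* Slow path: scan standby_slot_names and recompute the oldest LSN. */
	return false;
}

That way the walsender only pays the list-traversal cost when the cache is
stale or the standbys are genuinely behind, which sounds good to me.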
Thanks for the patch.
I did some performance test runs on patch v101 with synchronous_commit turned
on, to check how much the logical replication changes affect transaction
speed on the primary compared to the HEAD code. In all configurations there
is a primary, a logical subscriber and a physical standby, and the logical
subscriber is listed in synchronous_standby_names. This means a transaction
on the primary will not commit until the logical subscriber has confirmed
receipt of it.

Machine details:
Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz, 800GB RAM

My additional configuration on each instance is:
shared_buffers = 40GB
max_worker_processes = 32
max_parallel_maintenance_workers = 24
max_parallel_workers = 32
synchronous_commit = off
checkpoint_timeout = 1d
max_wal_size = 24GB
min_wal_size = 15GB
autovacuum = off

All tests are done using pgbench running for 15 minutes:
Creating tables: pgbench -p 6972 postgres -qis 2
Running benchmark: pgbench postgres -p 6972 -c 10 -j 3 -T 900 -P 5

HEAD code -
Primary with synchronous_commit=on, physical standby with
hot_standby_feedback=off
RUN1 (TPS)   RUN2 (TPS)   AVERAGE (TPS)
8.226658     8.17815      8.202404

HEAD code -
Primary with synchronous_commit=on, physical standby with
hot_standby_feedback=on
RUN1 (TPS)   RUN2 (TPS)   AVERAGE (TPS)
8.134901     8.229066     8.1819835   -- degradation from first config: -0.25%

PATCHED code - (v101-0001)
Primary with synchronous_commit=on, physical standby with
hot_standby_feedback=on, standby_slot_names not configured, logical
subscriber not failover enabled, physical standby not configured for sync
RUN1 (TPS)   RUN2 (TPS)   AVERAGE (TPS)
8.18839      8.18839      8.18839     -- degradation from first config: -0.17%

PATCHED code - (v98-0001)
synchronous_commit=on, hot_standby_feedback=on, standby_slot_names
configured to the physical standby, logical subscriber failover enabled,
physical standby configured for sync
RUN1 (TPS)   RUN2 (TPS)   AVERAGE (TPS)
8.173062     8.068536     8.120799    -- degradation from first config: -0.99%

Overall, I do not see any significant performance degradation with the patch
and slot synchronization enabled, with one logical subscriber and one
physical standby.
I am attaching the script for my final test configuration for reference.
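For readers who do not want to open the attachment, the final configuration
is roughly of this shape (a sketch only, with illustrative names; it is not
the attached script, and the slot-sync setting on the standby follows
whatever the patch version provides):

# primary (postgresql.conf)
synchronous_commit = on
synchronous_standby_names = 'logicalsub'   # the logical subscriber
standby_slot_names = 'physical_slot'       # the physical standby's slot

# physical standby (postgresql.conf)
primary_slot_name = 'physical_slot'
hot_standby_feedback = on
# plus the slot-synchronization setting introduced by the patch

-- logical subscriber
CREATE SUBSCRIPTION logicalsub
    CONNECTION 'host=primary_host port=6972 dbname=postgres'
    PUBLICATION pub WITH (failover = true);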

regards,
Ajin Cherian
Fujitsu Australia

Attachment Content-Type Size
patch_run_sync.zip application/x-zip-compressed 1.1 KB
