Re: Synchronizing slots from primary to standby

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-08-23 05:38:02
Message-ID: CAJpy0uD=DevMxTwFVsk_=xHqYNH8heptwgW6AimQ9fbRmx4ioQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 17, 2023 at 11:55 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Thu, Aug 17, 2023 at 11:44 AM Drouvot, Bertrand
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > On 8/14/23 11:52 AM, shveta malik wrote:
> >
> > >
> > > We (myself and Ajin) performed the tests to compute the lag in standby
> > > slots as compared to primary slots with different number of slot-sync
> > > workers configured.
> > >
> >
> > Thanks!
> >
> > > 3 DBs were created, each with 30 tables and each table having one
> > > logical-pub/sub configured. So this made a total of 90 logical
> > > replication slots to be synced. Then the workload was run for approx.
> > > 10 mins. During this workload, at regular intervals, primary and
> > > standby slots' lsns were captured (from pg_replication_slots) and
> > > compared. At each capture, the intent was to know how much each
> > > standby slot lags behind the corresponding primary slot, measured as
> > > the distance between the confirmed_flush_lsn of the primary slot and
> > > that of the standby slot. Then we took the average (integer value) of
> > > this distance over the span of the 10-min workload
> >
> > Thanks for the explanations, makes sense to me.
> >
> > > and this is what we got:
> > >
> > > With max_slot_sync_workers=1, average-lag = 42290.3563
> > > With max_slot_sync_workers=2, average-lag = 24585.1421
> > > With max_slot_sync_workers=3, average-lag = 14964.9215
> > >
> > > This shows that more workers have better chances to keep logical
> > > replication slots in sync for this case.
> > >
> >
> > Agree.
> >
> > > Another statistic, if it interests you: we ran a frequency test as
> > > well (by changing code, a unit-test sort of thing) to figure out the
> > > 'total number of times synchronization was done' with different
> > > numbers of slot-sync workers configured. Same 3-DB setup with each
> > > DB having 30 logical replication slots. With 'max_slot_sync_workers'
> > > set at 1, 2 and 3, the total number of times synchronization was
> > > done was 15874, 20205 and 23414 respectively. Note: this is not the
> > > same machine where we captured the lsn-gap data; it is a little less
> > > efficient machine, but it gives almost the same picture
> > >
> > > Next we are planning to capture this data for a smaller number of
> > > slots like 10, 30, 50 etc. It may happen that the benefit of multiple
> > > workers over a single worker is smaller in such cases, but let's have
> > > the data to verify that.
> > >
> >
> > Thanks a lot for those numbers and for the testing!
> >
> > Do you think it would make sense to also get the numbers using the
> > pg_failover_slots module? (and compare the pg_failover_slots numbers
> > with the "one worker" case here). The idea is to check whether the
> > patch introduces some overhead as compared to pg_failover_slots.
> >
>
> Yes, definitely. We will work on that and share the numbers soon.
>

Here are the numbers for the pg_failover_slots extension. Thank you,
Ajin, for performing all the tests and providing the data offline.

---------------------------------------------
pg_failover_slots extension (average lag):
---------------------------------------------
40 slots:
default nap (60 sec): 12742133.96
10ms nap: 19984.34

90 slots:
default nap (60 sec): 10063342.72
10ms nap: 34483.82

---------------------------------------------
slot-sync workers case (default 10ms nap for each test, average lag):
---------------------------------------------
40 slots:
1 worker: 20566.09
3 workers: 7885.80

90 slots:
1 worker: 36706.84
3 workers: 10236.63
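
For reference, both sets of numbers were captured the same way as
described upthread: at regular intervals, read confirmed_flush_lsn for
each slot from pg_replication_slots on the primary and the standby,
take the distance, and average it over the run. A minimal standalone
sketch of one such capture (the connection strings and slot name below
are made up for illustration) could look like:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <libpq-fe.h>

/* Parse PostgreSQL's textual LSN format "X/Y" into a 64-bit value. */
static uint64_t
parse_lsn(const char *lsn)
{
    unsigned int hi, lo;

    if (sscanf(lsn, "%X/%X", &hi, &lo) != 2)
    {
        fprintf(stderr, "bad LSN: %s\n", lsn);
        exit(1);
    }
    return ((uint64_t) hi << 32) | lo;
}

/* Fetch confirmed_flush_lsn of one slot from one node. */
static uint64_t
slot_confirmed_flush(const char *conninfo, const char *slot)
{
    PGconn     *conn = PQconnectdb(conninfo);
    const char *params[1] = {slot};
    PGresult   *res;
    uint64_t    lsn;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        exit(1);
    }
    res = PQexecParams(conn,
                       "SELECT confirmed_flush_lsn FROM pg_replication_slots"
                       " WHERE slot_name = $1",
                       1, NULL, params, NULL, NULL, 0);
    if (PQresultStatus(res) != PGRES_TUPLES_OK || PQntuples(res) != 1)
    {
        fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
        exit(1);
    }
    lsn = parse_lsn(PQgetvalue(res, 0, 0));
    PQclear(res);
    PQfinish(conn);
    return lsn;
}

int
main(void)
{
    /* made-up connection strings and slot name */
    uint64_t    p = slot_confirmed_flush("host=primary dbname=db1", "sub1_slot");
    uint64_t    s = slot_confirmed_flush("host=standby dbname=db1", "sub1_slot");

    printf("lag: %llu\n", (unsigned long long) (p - s));
    return 0;
}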

Observations:

1) The 1-worker case is slightly behind the pg_failover_slots extension
(for the same naptime of 10ms). This is due to the support for a
multi-worker design, where locks and DSM come into play; see the
illustrative sketch after this list. I will review this case for
optimization.
2) The multi-worker cases seem way better in all tests.
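
To illustrate point 1 (a purely hypothetical sketch, not the patch's
actual code; all names are made up): with a multi-worker design the
slot assignments live in shared memory, so even a single worker has to
take a lock before touching them, a cost a single-process extension
like pg_failover_slots does not pay:

#include "postgres.h"
#include "storage/lwlock.h"

/* hypothetical per-worker shared state */
typedef struct SlotSyncWorkerShared
{
    LWLock      lock;           /* guards the assignments below */
    int         nslots;         /* number of slots assigned */
    NameData    slot_names[FLEXIBLE_ARRAY_MEMBER];
} SlotSyncWorkerShared;

static void
sync_assigned_slots(SlotSyncWorkerShared *wshared)
{
    /* Even the 1-worker case pays this synchronization cost. */
    LWLockAcquire(&wshared->lock, LW_SHARED);

    for (int i = 0; i < wshared->nslots; i++)
    {
        /* fetch the remote slot's LSNs and advance the local slot ... */
    }

    LWLockRelease(&wshared->lock);
}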

A few points we observed while performing the tests on the
pg_failover_slots extension:

1) It has a default naptime of 60 sec, which is on the higher side, and
thus we see a huge lag in slots being synchronized; please see the
default-nap readings above. The extension's default numbers are thus
not comparable to our default case, so for an apples-to-apples
comparison we changed the naptime to 10ms for pg_failover_slots as
well.

2) It takes a lot of time while creating slots. Every slot creation
needs some workload to be run on the primary, i.e. if after, say, the
4th slot creation there is no activity going on on the primary, it
waits and does not proceed to create the rest of the slots, and thus we
had to make sure to perform some activity on the primary in parallel to
each slot creation on the standby. This happens because after each slot
creation it checks whether
'remote_slot->restart_lsn < MyReplicationSlot->data.restart_lsn' and,
if so, waits for the primary to catch up. The restart_lsn of a newly
created slot is set at the XLOG-replay position, so when the standby is
up to date in terms of data (i.e. all xlog streams are received and
replayed) and no activity is going on on the primary, the restart_lsn
on the standby for a newly created slot is at that moment the same as
the confirmed_flush_lsn of that slot on the primary. And thus, in order
to proceed, it needs the restart_lsn on the primary to move forward.

Would it make more sense to instead have a check which compares the
confirmed_flush of the primary slot with the restart_lsn of the standby
slot, i.e. wait for the primary to catch up only if
'remote_slot->confirmed_flush < MyReplicationSlot->data.restart_lsn'?
That check would mean we need to wait only if more operations were
performed on the primary and the xlogs were received and replayed on
the standby, but the slots on the primary have still not advanced, so
we need to give the primary time to catch up.
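
In code terms the change would be roughly the following (only the
condition matters; the surrounding wait helper is a placeholder name):

/* current behaviour: wait whenever the primary slot's restart_lsn is
 * behind the newly created standby slot's restart_lsn */
if (remote_slot->restart_lsn < MyReplicationSlot->data.restart_lsn)
    wait_for_primary_catchup(remote_slot);  /* placeholder name */

/* proposed: wait only if even the primary's confirmed_flush has not
 * reached the standby slot's restart_lsn, i.e. the primary genuinely
 * has to advance the slot before the standby slot can be used */
if (remote_slot->confirmed_flush < MyReplicationSlot->data.restart_lsn)
    wait_for_primary_catchup(remote_slot);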

thanks

Shveta
