Re: Synchronizing slots from primary to standby

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-08-23 10:07:59
Message-ID: CAJpy0uC+2agRtF3H=n-hW5JkoPfaZkjPXJr==y3_PRE04dQvhw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 17, 2023 at 4:09 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Thu, Aug 17, 2023 at 11:55 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > On Thu, Aug 17, 2023 at 11:44 AM Drouvot, Bertrand
> > <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> > >
> > > Hi,
> > >
> > > On 8/14/23 11:52 AM, shveta malik wrote:
> > >
> > > >
> > > > We (Ajin and myself) performed tests to measure the lag of
> > > > standby slots relative to primary slots, with different numbers
> > > > of slot-sync workers configured.
> > > >
> > >
> > > Thanks!
> > >
> > > > 3 DBs were created, each with 30 tables, and each table had one
> > > > logical pub/sub configured. This made a total of 90 logical
> > > > replication slots to be synced. The workload was then run for
> > > > approximately 10 minutes. During this workload, at regular
> > > > intervals, the primary and standby slots' lsns were captured
> > > > (from pg_replication_slots) and compared. At each capture, the
> > > > intent was to measure how far each standby slot lagged behind the
> > > > corresponding primary slot, taken as the distance between the
> > > > confirmed_flush_lsn of the primary and standby slots. We then
> > > > took the average (integer value) of this distance over the span
> > > > of the 10-minute workload
> > >
> > > Thanks for the explanations, makes sense to me.
> > >
> > > > and this is what we got:
> > > >
> > > > With max_slot_sync_workers=1, average-lag = 42290.3563
> > > > With max_slot_sync_workers=2, average-lag = 24585.1421
> > > > With max_slot_sync_workers=3, average-lag = 14964.9215
> > > >
> > > > This shows that, for this case, more workers have a better
> > > > chance of keeping the logical replication slots in sync.
> > > >
> > >
> > > Agree.
> > >
> > > > Another statistic, if it interests you: we also ran a frequency
> > > > test (by instrumenting the code, unit-test style) to find the
> > > > total number of synchronizations performed with different numbers
> > > > of slot-sync workers configured. Same 3-DB setup, with each DB
> > > > having 30 logical replication slots. With 'max_slot_sync_workers'
> > > > set to 1, 2 and 3, the total number of synchronizations performed
> > > > was 15874, 20205 and 23414 respectively. Note: this was not run
> > > > on the same machine where we captured the lsn-gap data, but on a
> > > > somewhat slower machine; it gives almost the same picture
> > > >
> > > > Next we are planning to capture this data for smaller numbers of
> > > > slots, such as 10, 30 and 50. The benefit of multiple workers
> > > > over a single worker may turn out to be smaller in such cases,
> > > > but let's have the data to verify that.
> > > >
> > >
> > > Thanks a lot for those numbers and for the testing!
> > >
> > > Do you think it would make sense to also get the numbers using the
> > > pg_failover_slots module (and compare the pg_failover_slots numbers
> > > with the "one worker" case here)? The idea is to check whether the
> > > patch introduces any overhead as compared to pg_failover_slots.
> > >
> >
> > Yes, definitely. We will work on that and share the numbers soon.
> >
>
> We are working on these tests. Meanwhile, attached are the patches,
> which attempt to implement the following functionality:
>
> 1) Remove extra slots on the standby if they no longer exist on the
> primary or are no longer part of synchronize_slot_names.
> 2) Make synchronize_slot_names user-modifiable. If a change to
> 'synchronize_slot_names' shrinks the dbids list, take care of removing
> the extra/old db-ids (if any) from the workers' db-lists.
>
> Thanks to Ajin for working on 1. Both of the above changes are in
> patch 0002. There is a test failure in the recovery module due to
> these new changes; I am looking into it and will fix it in the next
> version.
>
> Improvements in the pipeline:
> a) Standby slots should not be consumable.
> b) Optimize the query which the standby sends to the primary.
> Currently it has a dbid filter and a slot-name filter. The slot-name
> filter can be narrowed to only those slots which belong to the DBs
> assigned to the worker, rather than all of 'synchronize_slot_names'.
> c) Analyze whether the naptime of the slot-sync worker can be
> auto-tuned: if there is no activity (i.e. slots are not advancing on
> the primary), increase the naptime of the slot-sync worker on the
> standby, and decrease it again when activity starts.
>

Please find the patches attached. 0002 has the following changes:

1) The naptime of the worker is now tuned according to activity on the
primary. Each worker starts with a naptime of 10ms; if no activity is
observed on the primary for some time, the naptime is increased to
10s, and once activity is observed again it is reduced back to 10ms.
Each worker does this by picking one slot (the first one assigned to
it) for monitoring purposes: if the lsn of that slot does not change
for, say, 5 consecutive sync-checks, the naptime is increased to 10s;
as soon as a change is observed, it is reduced back to 10ms. A
simplified sketch of this logic follows.
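
To make the tuning logic concrete, here is a simplified C sketch. The
constant names and the helper function below are illustrative only, not
necessarily the identifiers used in the patch; the values (10ms, 10s,
5 checks) are the ones described above.

#include "postgres.h"
#include "access/xlogdefs.h"

/* Illustrative names and values; the patch's identifiers may differ. */
#define SLOTSYNC_NAPTIME_MS            10      /* active: 10ms */
#define SLOTSYNC_INACTIVE_NAPTIME_MS   10000   /* idle: 10s */
#define SLOTSYNC_INACTIVITY_THRESHOLD  5       /* idle sync-checks */

static long       naptime_ms = SLOTSYNC_NAPTIME_MS;
static int        idle_checks = 0;
static XLogRecPtr last_seen_lsn = InvalidXLogRecPtr;

/*
 * Called once per sync cycle with the confirmed_flush_lsn of the
 * monitoring slot (the first slot assigned to this worker).
 */
static void
tune_naptime(XLogRecPtr monitored_lsn)
{
    if (monitored_lsn == last_seen_lsn)
    {
        /* No movement on the primary; back off after enough idle checks. */
        if (++idle_checks >= SLOTSYNC_INACTIVITY_THRESHOLD)
            naptime_ms = SLOTSYNC_INACTIVE_NAPTIME_MS;
    }
    else
    {
        /* Activity observed again; resume frequent syncing. */
        idle_checks = 0;
        naptime_ms = SLOTSYNC_NAPTIME_MS;
        last_seen_lsn = monitored_lsn;
    }
}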

2) The query sent by the standby to the primary to get slot info has
been improved. The query has two filters: WHERE dbid IN (...) AND
slot_name IN (...). Earlier, the slot_name filter carried all the
names mentioned in synchronize_slot_names (when it is not '*'). Now it
lists only the slots belonging to the worker's own dbids, except
during the first run of the query. The first run is different because
we obtain the information about which slot belongs to which db from
the standby only; thus, until the slots have been created on the
standby, the query carries all the slot names in
'synchronize_slot_names'. This one-time longer query seems better than
pinging the primary for this info. A sketch of how such a query could
be built follows.
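
For reference, here is a sketch of how the standby could build this
query. The function name, column list and exact query text are
illustrative, not necessarily what the patch produces:

#include "postgres.h"
#include "lib/stringinfo.h"

/*
 * Build the slot-fetch query sent to the primary.  On the first run,
 * slot_name_list carries everything in synchronize_slot_names (we do
 * not yet know which slot belongs to which db); on later runs it
 * carries only the slots belonging to this worker's dbids.  A NULL
 * slot_name_list stands for synchronize_slot_names = '*'.  'query'
 * must already have been initialized with initStringInfo().
 */
static void
build_slot_fetch_query(StringInfo query, const char *dbid_list,
                       const char *slot_name_list)
{
    appendStringInfo(query,
                     "SELECT slot_name, database, confirmed_flush_lsn,"
                     " restart_lsn, catalog_xmin"
                     " FROM pg_catalog.pg_replication_slots"
                     " WHERE datoid IN (%s)", dbid_list);

    if (slot_name_list != NULL)
        appendStringInfo(query, " AND slot_name IN (%s)", slot_name_list);
}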

Changes to be done/analyzed next:
1) Find a way to distinguish between user-created logical slots and
synced ones (see the sketch after this list). This is needed to:
a) avoid the slot-sync worker dropping user-created slots;
b) ensure that, unlike user-created slots, synced slots are not
consumable.

2) Handle the corner scenarios below:
a) If a worker is exiting due to a change in synchronize_slot_names
which made that worker's dbids no longer valid, the worker may leave
behind some slots which should otherwise be dropped.
b) A worker is connected to a dbid which no longer exists.

3) Analyze whether there is any interference with the 'minimal logical
decoding on standby' feature.
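
For item 1 above, one possible direction (purely a sketch; the 'synced'
field and the function below do not exist in the attached patches) is
to carry a flag in the slot's persistent data marking slot-sync-created
slots, which both the drop path and the consumption path could check:

#include "postgres.h"
#include "access/xlog.h"        /* RecoveryInProgress() */
#include "replication/slot.h"

/*
 * Hypothetical check, assuming a boolean 'synced' field were added to
 * ReplicationSlotPersistentData: refuse to consume a synced slot while
 * in recovery.  The same flag would let the slot-sync worker drop only
 * slots it created itself, leaving user-created slots alone.
 */
static void
check_synced_slot_not_consumed(ReplicationSlot *slot)
{
    if (RecoveryInProgress() && slot->data.synced)
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot consume replication slot \"%s\" synced from the primary",
                        NameStr(slot->data.name))));
}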

thanks
Shveta

Attachment Content-Type Size
v13-0001-Allow-logical-walsenders-to-wait-for-physical-st.patch application/octet-stream 21.1 KB
v13-0002-Add-logical-slot-sync-capability-to-physical-sta.patch application/octet-stream 77.6 KB
