Re: Synchronizing slots from primary to standby

From: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-02-08 06:14:05
Message-ID: CABdArM6ryvHgXcarVgVJAHt-Ygxs48Ua5N5v=jdnHkrDY8vOwA@mail.gmail.com
Lists: pgsql-hackers

We conducted stress testing for the patch with a setup of one primary
node with 100 tables and five subscribers, each having 20
subscriptions. We then created three physical standbys that sync the
logical replication slots from the primary node.
All 100 slots were successfully synced on all three standbys. We then
ran the load and monitored LSN convergence using the prescribed SQL
checks.
Once the standbys were failover-ready, we successfully promoted one
of them, and all the subscribers seamlessly migrated to the new
primary node.
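
For anyone scripting a similar convergence check, here is a minimal
sketch in Python. It is not the set of SQL checks we ran; it only
illustrates comparing per-slot LSNs once they have been fetched. The
helper names (parse_lsn, lagging_slots) and the dict inputs are
hypothetical, and which LSN columns to compare on each node is left
as an assumption for the caller. It relies only on the textual LSN
format: two hexadecimal numbers separated by a slash, the high and
low 32 bits of a 64-bit position.

```python
def parse_lsn(text: str) -> int:
    """Convert a PostgreSQL LSN such as '0/3000060' to a 64-bit integer."""
    high, low = text.strip().split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lagging_slots(reference: dict[str, str], standby: dict[str, str]) -> list[str]:
    """Return slot names that are missing on the standby or behind the reference.

    Both arguments map slot name -> LSN text. How these values are
    collected (which node, which column) is an assumption left to the caller.
    """
    behind = []
    for name, reference_lsn in sorted(reference.items()):
        standby_lsn = standby.get(name)
        if standby_lsn is None or parse_lsn(standby_lsn) < parse_lsn(reference_lsn):
            behind.append(name)
    return behind
```

An empty list from lagging_slots() would mean every slot on the
standby has reached at least the reference LSN for that sample.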

We repeated the tests with 200 tables, creating 200 logical
replication slots. Even with the increased load, all the tests
completed successfully.

Minor errors (not due to the patch) observed during the tests:

1) Under load, the logical replication apply workers on the
subscribers started failing due to timeouts. This is not related to
the patch: the "wal_receiver_timeout" setting was too small for the
load. To confirm, we ran the same load without the patch, and the
same failure occurred.
2) There was a buffer overflow exception on the primary node in the
'200 replication slots' case. It was not related to the patch; it
was caused by an insufficient memory configuration.

All the tests were run in both Windows and Linux environments.
Thank you, Ajin, for the stress test and analysis on Linux.
