| From: | Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> |
|---|---|
| To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication |
| Date: | 2026-06-04 07:36:09 |
| Message-ID: | CAE9k0Pkk6q72X3Rc3MUo7PxU46UcCzLfMhM02PGDUmAue9cDGg@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi,
On Thu, Jun 4, 2026 at 9:14 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Wed, Jun 3, 2026 at 4:30 PM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
> >
> > Hi Shveta,
> >
> > On Fri, May 15, 2026 at 9:28 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > >
> > > Ashutosh, while testing further, I noticed that
> > > 'synchronized_standby_slots' does not filter duplicate entries. As an
> > > example, if a user ends up giving one entry twice in a priority
> > > configuration, then we will end up waiting on one slot twice rather
> > > than waiting on 2 different slots.
> > >
> > > Example:
> > > alter system set synchronized_standby_slots = 'FIRST 2 (standby_1,
> > > standby_1, standby_2, standby_3)';
> > > select pg_reload_conf();
> > > insert into tab1 values (10), (20), (30);
> > > select pg_logical_slot_get_binary_changes('sub1', NULL, NULL,
> > > 'proto_version', '4', 'publication_names', 'pub1');
> > >
> > > The last statement works even though standby_2 and standby_3 do not
> > > exist. It consumes standby_1 twice and thinks that the required number
> > > of slots has caught-up.
> > >
> > > OTOH, if we use the same configuration for
> > > 'synchronous_standby_names', it correctly waits for standby_2 and does
> > > not count on standby_1 twice.
> > >
> > > alter system set synchronous_standby_names = 'FIRST 2 (standby_1,
> > > standby_1, standby_2, standby_3)';
> > > insert into tab1 values (10), (20), (30); ----> This will wait on standby_2
> > >
> > > This is perhaps because 'synchronous_standby_names' waits on active
> > > WAL senders rather than repeated strings in configuration. But our
> > > code changes wait on the names present in 'synchronized_standby_slots'
> > > without filtering out duplicates.
> > >
> >
> > May I know what your expectation is here? Would you like the check
> > hook for synchronized_standby_slots to automatically resolve
> > duplicates into a unique set of values, or should it detect duplicate
> > entries and raise an error so that the user can correct the
> > configuration?
> >
> > If we automatically resolve duplicates, the user would still see the
> > GUC configured exactly as they specified, even though it would not
> > function the same way internally. For example, if a user sets:
> >
> > FIRST 2 (s1, s1, s1, s2)
> >
> > it might internally be resolved to:
> >
> > FIRST 2 (s1, s2)
> >
> > However, when the user runs SHOW, it would still display the original
> > configuration. This could give the user an incorrect impression of how
> > the setting is actually being interpreted. Because of this, I feel we
> > should treat duplicate entries as an invalid configuration and raise
> > an error.
> >
> > As far as synchronous_standby_names is concerned, I can see that
> > configurations such as:
> >
> > FIRST 2 (s1, s1, s1, s1)
> >
> > are currently accepted, which I don't think is correct either. They
> > should have been rejected, possibly resulting in a server startup
> > failure.
> >
>
> My preference, and original intent, was to accept duplicate entries
> and skip them internally. Doc can be updated to say 'duplicate entries
> are skipped'. A server startup failure due to duplicate entries in a
> GUC does not seem right to me. If the alter-system command fails due
> to duplicate entries, that is still fine, but a startup failure seems
> excessive. But let's see what others have to say on this.
>
Okay, the attached patch adds the capability to automatically remove
duplicate entries from the synchronized_standby_slots list. In N-of-M
mode, if N exceeds M, the number of distinct entries left after
duplicates are removed, an error is raised. This behavior has been
documented, and test cases verifying the change have been added.
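To make the intended behavior concrete, here is a minimal sketch in
Python. It is not the code from the attached patches (those are in C);
the function names and the error text are hypothetical, and it assumes
slot names are compared as exact strings and that the first occurrence
of each name keeps its position in the priority order.

```python
def dedup_slot_names(names):
    """Drop repeated slot names, keeping the first occurrence of each.

    Order is preserved because FIRST N treats the list as a priority order.
    """
    seen = set()
    unique = []
    for name in names:
        if name not in seen:
            seen.add(name)
            unique.append(name)
    return unique


def resolve_slot_config(num_required, names):
    """Hypothetical helper: deduplicate first, then reject N > M.

    num_required is N; M is the number of distinct names left after
    duplicates are removed. The error message is illustrative only.
    """
    unique = dedup_slot_names(names)
    if num_required > len(unique):
        raise ValueError(
            f"number of required slots ({num_required}) exceeds "
            f"number of distinct slots listed ({len(unique)})"
        )
    return num_required, unique


# 'FIRST 2 (standby_1, standby_1, standby_2, standby_3)' keeps three
# distinct slots, so standby_1 can no longer be counted twice.
n, slots = resolve_slot_config(
    2, ["standby_1", "standby_1", "standby_2", "standby_3"]
)
print(n, slots)
```

Under the same assumptions, 'FIRST 2 (s1, s1, s1, s1)' reduces to a
single distinct slot, so N > M and the sketch raises an error instead
of silently counting s1 twice.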
A few other minor comments from [1] have also been addressed. Please
have a look at the attached patches with these changes.
--
With Regards,
Ashutosh Sharma.
| Attachment | Content-Type | Size |
|---|---|---|
| 0001-Refactor-syncrep-parsing-to-represent-bare-standby-l.patch | application/octet-stream | 3.1 KB |
| 0003-Add-FIRST-N-and-N-.-priority-syntax-to-synchronized_.patch | application/octet-stream | 22.8 KB |
| 0002-Add-ANY-N-semantics-to-synchronized_standby_slots.patch | application/octet-stream | 42.9 KB |