Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Cc: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication
Date: 2026-02-26 06:19:53
Message-ID: CAA4eK1+CNnJ=p4yTXJKDYKLiJQttjCBdJsoioFVyt=JOxH6chA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 26, 2026 at 10:28 AM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
>
>
> > >
> > > Proposal:
> > >
> > > Make synchronized_standby_slots quorum aware i.e. extend the GUC to accept an ANY M (slot1, slot2, ...) syntax similar to synchronous_standby_names, so StandbySlotsHaveCaughtup() can return true when M of N slots (where M <= N and M >= 1) have caught up. I still prefer two different GUCs for this as the list of slots to be synchronized can still be different (for example, DBA may want to ensure Geo standby to be sync before allowing the logical decoding client to read the changes). I kept synchronized_standby_slots parse logic similar to synchronous_standby_names to keep things simple. The default behavior is also not changed for synchronized_standby_slots.
> > >
...
>
> Thinking about this further, using quorum settings for
> synchronized_standby_slots can/will certainly result in at least one
> sync standby lagging behind the logical replica, making it probably
> impossible to continue with the existing logical replication setup
> after a failover to the standby that lags behind. Here is what I am
> mean:
>

But won't that be true even for synchronous_standby_names? I think in
the case of quorum, it is the responsibility of the failover solution
to select the most recent synced standby among all the standby's
specified in synchronous_standby_names. Similarly here before failing
over logical subscriber to one of physical standby, the failover tool
needs to ensure it is switching over to the synced replica. We have
given steps in the docs [1] that could be used to identify the replica
where the subscriber can switchover. Will that address your concern?

BTW, I have also suggested this idea in thread [2]. I don't recall all
the ideas/points discussed in that thread but it would be good to
check that thread for any alternative ideas and points raised, so that
we don't miss anything.

[1] - https://www.postgresql.org/docs/current/logical-replication-failover.html
[2] - https://www.postgresql.org/message-id/CAA4eK1KLFdmj8CLrZNL0D4phqyQihb7NXOjmqvrU5DT8moQn9Q%40mail.gmail.com

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2026-02-26 06:30:05 Re: Addition and subtraction operations for the interval and integer types
Previous Message Chao Li 2026-02-26 06:13:44 Re: Show comments in \dRp+, \dRs+, and \dX+ psql meta-commands