| From: | SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com> |
|---|---|
| To: | Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> |
| Cc: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication |
| Date: | 2026-02-26 10:46:08 |
| Message-ID: | CAHg+QDct_WLz8b+JpxvBYVUptEMySME4HR6NOodc1uCK91525g@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi Ashutosh,
On Thu, Feb 26, 2026 at 1:11 AM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
wrote:
> Hi,
>
> On Thu, Feb 26, 2026 at 2:15 PM shveta malik <shveta(dot)malik(at)gmail(dot)com>
> wrote:
> >
> > On Thu, Feb 26, 2026 at 1:54 PM SATYANARAYANA NARLAPURAM
> > <satyanarlapuram(at)gmail(dot)com> wrote:
> > >
> > > Hi Ashutosh,
> > >
> > > On Wed, Feb 25, 2026 at 11:42 PM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
> > >>
> > >>
> > >> I don't think we should be comparing "synchronous_standby_names" with
> > >> "synchronized_standby_slots", even though they appear similar in
> > >> purpose. All values listed in synchronous_standby_names represent
> > >> synchronous standbys exclusively, whereas synchronized_standby_slots
> > >> can hold values for both synchronous and asynchronous standbys. In
> > >> other words, every server referenced by synchronous_standby_names is
> > >> of the same type, but that may not be the case with
> > >> synchronized_standby_slots.
> > >>
> > >> If a GUC can hold values of different types (sync vs. async), does it
> > >> really make sense to use a qualifier like ANY 1 (val1, val2) when val1
> > >> and val2 are different in nature? For example, suppose val1 is a
> > >> synchronous standby and val2 is an asynchronous standby, and we
> > >> configure ANY 1 (val1, val2). It's possible for val2 to get ahead of
> > >> val1 in terms of replication progress, which in turn could mean the
> > >> logical replica is also ahead of val1. So if we were to fail over to
> > >> val1 (since it's the only synchronous standby), we will not be able to
> > >> use the existing logical replication setup.
> > >
> > >
> > > If the failover orchestrator cannot ensure standby1 to not get the
> > > quorum committed WAL (from archive or standby2) then the setting
> > > ANY 1 (val1, val2) is invalid.
> > > This setup also has issues because in your scenario, standby2 is
> > > ahead of the new primary (standby1) and standby2 now needs to rewind
> > > to be in sync with the new primary. Additionally, it allowed readers
> > > to read data that was lost at the end of the failover. We ideally
> > > need a mechanism that does not send WAL to async replicas before the
> > > sync replicas commit (honoring the synchronous_standby_names GUC),
> > > similar to synchronized_standby_slots. It could be a different
> > > thread on its own.
> >
> >
> > +1 on the overall idea of the patch.
> > I understand the concern raised above that one of the standbys in the
> > quorum (synchronized_standby_slots) might lag behind the logical
> > replica, and a user could potentially failover to such a standby. But
> > I also agree with Amit that configuring failover correctly is
> > ultimately the responsibility of the failover solution, and the
> > instructions in the docs should be followed before deciding whether
> > a standby is failover-ready.
> >
> > As suggested in [1], IMO, it is a reasonably good idea for
> > 'synchronized_standby_slots' to DEFAULT to the value of
> > 'synchronous_standby_names'. That way, even if the user failed to
> > configure 'synchronized_standby_slots' explicitly, we would still have
> > reasonable protection in place. At the same time, if a user
> > intentionally chooses not to configure it, a NULL/NONE value should
> > remain a valid option.
> >
>
> AFAIU, not all names listed in "synchronous_standby_names" are
> necessarily synchronous standbys. Tools like pg_receivewal, for
> example, can establish a replication connection to the primary and
> appear in that list. Therefore, deriving "synchronized_standby_slots"
> from "synchronous_standby_names", if not set by the user would cause
> logical slots to be synchronized to whatever nodes those names
> represent, including a host running pg_receivewal, which is certainly
> not something the user would have intended to do. Therefore I feel
> this might not be a good choice.
Agreed, it is not a good idea to have synchronized_standby_slots default
to synchronous_standby_names: as stated, synchronous_standby_names holds
application_name values, whereas synchronized_standby_slots holds slot
names, and the two are different.
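
To illustrate the mismatch, here is a minimal Python sketch. The names standby1, standby2, slot_standby1 and slot_standby2, and both helper functions, are hypothetical and made up for illustration. The parser is a simplification that only handles the plain unquoted forms of a synchronous_standby_names value; it is not how the server parses the GUC.

```python
import re


def sync_standby_members(value: str) -> list[str]:
    """Return the member names from a synchronous_standby_names-style value.

    Simplified: handles 'a, b', 'N (a, b)', 'FIRST N (a, b)' and
    'ANY N (a, b)'. Quoted names and the special '*' entry are not handled.
    """
    match = re.fullmatch(
        r"\s*(?:(?:FIRST|ANY)\s+)?\d+\s*\((.*)\)\s*", value, re.IGNORECASE
    )
    body = match.group(1) if match else value
    return [name.strip() for name in body.split(",") if name.strip()]


def names_without_slot(sync_names_value: str, physical_slots: set[str]) -> list[str]:
    """List the members that do not match any physical replication slot name."""
    members = sync_standby_members(sync_names_value)
    return [name for name in members if name not in physical_slots]


# Hypothetical setup: application_name values on one side, slot names on the other.
missing = names_without_slot(
    "ANY 1 (standby1, standby2)", {"slot_standby1", "slot_standby2"}
)
print(missing)
```

In this made-up setup every name taken from the synchronous_standby_names value fails to match a slot, so a derived default would point at slots that do not exist.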
Thanks,
Satya