| From: | Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> |
|---|---|
| To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Cc: | SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication |
| Date: | 2026-02-26 09:11:42 |
| Message-ID: | CAE9k0PnOUth5tjT21wD75QRUsREQ35=z9JgqOFVUdCLrQ62s3g@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi,
On Thu, Feb 26, 2026 at 2:15 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Thu, Feb 26, 2026 at 1:54 PM SATYANARAYANA NARLAPURAM
> <satyanarlapuram(at)gmail(dot)com> wrote:
> >
> > Hi Ashutosh,
> >
> > On Wed, Feb 25, 2026 at 11:42 PM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
> >>
> >>
> >> I don't think we should be comparing "synchronous_standby_names" with
> >> "synchronized_standby_slots", even though they appear similar in
> >> purpose. All values listed in synchronous_standby_names represent
> >> synchronous standbys exclusively, whereas synchronized_standby_slots
> >> can hold values for both synchronous and asynchronous standbys. In
> >> other words, every server referenced by synchronous_standby_names is
> >> of the same type, but that may not be the case with
> >> synchronized_standby_slots.
> >>
> >> If a GUC can hold values of different types (sync vs. async), does it
> >> really make sense to use a qualifier like ANY 1 (val1, val2) when val1
> >> and val2 are different in nature? For example, suppose val1 is a
> >> synchronous standby and val2 is an asynchronous standby, and we
> >> configure ANY 1 (val1, val2). It's possible for val2 to get ahead of
> >> val1 in terms of replication progress, which in turn could mean the
> >> logical replica is also ahead of val1. So if we were to fail over to
> >> val1 (since it's the only synchronous standby), we will not be able to
> >> use the existing logical replication setup.
> >
> >
> > If the failover orchestrator cannot ensure standby1 to not get the quorum committed WAL (from archive or standby2) then the setting ANY 1 (val1, val2) is invalid.
> > This setup also has issues because, in your scenario, standby2 is ahead of the new primary (standby1) and now has to be rewound to get back in sync with the new primary. Additionally, it allowed readers to read data that was lost at the end of the failover. Ideally we need a mechanism, similar to synchronized_standby_slots, that does not send WAL to async replicas before the sync replicas have committed it (honoring the synchronous_standby_names GUC). That could be a separate thread of its own.
>
>
> +1 on the overall idea of the patch.
> I understand the concern raised above that one of the standbys in the
> quorum (synchronized_standby_slots) might lag behind the logical
> replica, and a user could potentially failover to such a standby. But
> I also agree with Amit that configuring failover correctly is
> ultimately the responsibility of the failover solution, and the
> instructions in the docs should be followed before deciding whether a
> standby is failover-ready.
>
> As suggested in [1], IMO, it is a reasonably good idea for
> 'synchronized_standby_slots' to DEFAULT to the value of
> 'synchronous_standby_names'. That way, even if the user forgot to
> configure 'synchronized_standby_slots' explicitly, we would still have
> reasonable protection in place. At the same time, if a user
> intentionally chooses not to configure it, a NULL/NONE value should
> remain a valid option.
>
AFAIU, not all names listed in "synchronous_standby_names" are
necessarily synchronous standbys. Tools like pg_receivewal, for
example, can establish a replication connection to the primary and
appear in that list. Deriving "synchronized_standby_slots" from
"synchronous_standby_names" when the user has not set it would
therefore cause logical slots to be synchronized to whatever nodes
those names represent, including a host running pg_receivewal, which
is certainly not what the user intended. So I don't think this would
be a good default.
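
To make the concern concrete, here is a rough Python sketch. It is not
PostgreSQL code: the parser handles only the simple list, "ANY n (...)"
and "[FIRST] n (...)" forms, and the client names (standby1, standby2,
pg_receivewal) and the "derived default" are assumptions made up for
illustration.

```python
import re


def parse_sync_names(value):
    """Return the member names from a synchronous_standby_names-style string.

    Simplified on purpose: supports 'a, b', 'ANY n (a, b)' and
    '[FIRST] n (a, b)' only.
    """
    m = re.match(r"^\s*(?:(?:ANY|FIRST)\s+)?\d+\s*\((.*)\)\s*$",
                 value, re.IGNORECASE)
    members = m.group(1) if m else value
    return [name.strip().strip('"') for name in members.split(",")
            if name.strip()]


# Hypothetical replication clients, keyed by application_name.
clients = {
    "standby1": "physical standby",
    "standby2": "physical standby",
    "pg_receivewal": "WAL archiver (pg_receivewal)",
}

# What a default derived from synchronous_standby_names would inherit.
derived_default = parse_sync_names("ANY 1 (standby1, pg_receivewal)")

# Entries in the derived list that are not standbys at all.
not_standbys = [n for n in derived_default
                if clients[n] != "physical standby"]
print(not_standbys)  # ['pg_receivewal']
```

The derived list has no way of telling a real standby from a WAL
receiver, which is the problem with inheriting it blindly.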
--
With Regards,
Ashutosh Sharma.