| From: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
|---|---|
| To: | SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com> |
| Cc: | Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Subject: | Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication |
| Date: | 2026-02-26 08:45:12 |
| Message-ID: | CAJpy0uAMFqLjKMD4h3uuXBGXpGzeUBFQe80RJ9b26YQ3CXZbog@mail.gmail.com |
| Lists: | pgsql-hackers |
On Thu, Feb 26, 2026 at 1:54 PM SATYANARAYANA NARLAPURAM
<satyanarlapuram(at)gmail(dot)com> wrote:
>
> Hi Ashutosh,
>
> On Wed, Feb 25, 2026 at 11:42 PM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
>>
>>
>> I don't think we should be comparing "synchronous_standby_names" with
>> "synchronized_standby_slots", even though they appear similar in
>> purpose. All values listed in synchronous_standby_names represent
>> synchronous standbys exclusively, whereas synchronized_standby_slots
>> can hold values for both synchronous and asynchronous standbys. In
>> other words, every server referenced by synchronous_standby_names is
>> of the same type, but that may not be the case with
>> synchronized_standby_slots.
>>
>> If a GUC can hold values of different types (sync vs. async), does it
>> really make sense to use a qualifier like ANY 1 (val1, val2) when val1
>> and val2 are different in nature? For example, suppose val1 is a
>> synchronous standby and val2 is an asynchronous standby, and we
>> configure ANY 1 (val1, val2). It's possible for val2 to get ahead of
>> val1 in terms of replication progress, which in turn could mean the
>> logical replica is also ahead of val1. So if we were to fail over to
>> val1 (since it's the only synchronous standby), we will not be able to
>> use the existing logical replication setup.
>
>
> If the failover orchestrator cannot ensure that standby1 gets the quorum-committed WAL (from the archive or standby2), then the setting ANY 1 (val1, val2) is invalid.
> This setup also has issues because, in your scenario, standby2 is ahead of the new primary (standby1), and standby2 now needs to rewind to be in sync with the new primary. Additionally, it allowed readers to read data that was lost at the end of the failover. Ideally, we need a mechanism that does not send WAL to async replicas before the sync replicas commit, i.e. a feature that honors the synchronous_standby_names GUC (similar to synchronized_standby_slots). That could be a separate thread of its own.

+1 on the overall idea of the patch.

I understand the concern raised above that one of the standbys in the
quorum (synchronized_standby_slots) might lag behind the logical
replica, and a user could potentially fail over to such a standby. But
I also agree with Amit that configuring failover correctly is
ultimately the responsibility of the failover solution, and the
instructions in the documentation should be followed before deciding
whether a standby is failover-ready.
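
To make that concern concrete, here is a rough Python sketch (not taken from
the patch) of how an ANY k quorum over the listed slots could decide how far
logical decoding is allowed to advance. The function name, the slot names
val1/val2, and the plain-integer LSNs are illustrative assumptions only.

```python
def quorum_confirmed_lsn(flush_lsns, k):
    """Return the highest LSN that at least k of the listed slots have flushed.

    flush_lsns maps a (hypothetical) slot name to its flushed LSN, modelled
    here as a plain integer for simplicity.
    """
    if not 1 <= k <= len(flush_lsns):
        raise ValueError("k must be between 1 and the number of slots")
    # The k-th largest flush position is confirmed by at least k slots.
    return sorted(flush_lsns.values(), reverse=True)[k - 1]


# Scenario from the thread: val2 has replayed further than val1.
flush_lsns = {"val1": 100, "val2": 250}

# With ANY 1, the logical replica may advance as far as the fastest slot.
logical_limit = quorum_confirmed_lsn(flush_lsns, k=1)

# Any standby behind that limit is not failover-ready for logical
# replication until it catches up.
lagging = [name for name, lsn in flush_lsns.items() if lsn < logical_limit]
```

In this sketch val1 ends up in `lagging`, which is the case a failover
solution would need to detect before promoting it.
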

As suggested in [1], IMO it is a reasonably good idea for
'synchronized_standby_slots' to DEFAULT to the value of
'synchronous_standby_names'. That way, even if the user forgot to
configure 'synchronized_standby_slots' explicitly, we would still have
reasonable protection in place. At the same time, if a user
intentionally chooses not to configure it, a NULL/NONE value should
remain a valid option.

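
A minimal Python sketch of the three cases I have in mind (unset, explicitly
NONE, explicitly configured) is below. It only illustrates the intended
semantics; the function name, the `_UNSET` sentinel, and the exact spelling of
the "NONE" keyword are my assumptions, and how standby names would map to slot
names is deliberately left out.

```python
_UNSET = object()  # sentinel: the user never set synchronized_standby_slots


def effective_synchronized_standby_slots(synchronous_standby_names,
                                         synchronized_standby_slots=_UNSET):
    """Resolve the value that would actually be enforced (hypothetical)."""
    if synchronized_standby_slots is _UNSET:
        # Not configured at all: fall back to synchronous_standby_names.
        return synchronous_standby_names
    if (synchronized_standby_slots is None
            or synchronized_standby_slots.strip().upper() in ("", "NONE")):
        # Explicit opt-out: the user intentionally wants no protection.
        return ""
    # Explicitly configured: use it as given.
    return synchronized_standby_slots
```

The point of the sentinel is that "never configured" and "explicitly set to
NONE" stay distinguishable, so the default cannot override a deliberate choice.
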
Thanks,
Shveta