| From: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
|---|---|
| To: | Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Subject: | Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication |
| Date: | 2026-04-15 06:47:17 |
| Message-ID: | CAJpy0uCGw_3OXFqO6OHAKRPei_xkogaVJBHfvS-RQppxMAGdMg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Wed, Apr 15, 2026 at 12:00 PM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
>
> Hi Shveta,
>
> On Wed, Apr 15, 2026 at 11:09 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > On Tue, Apr 14, 2026 at 6:01 PM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
> > >
> > >
> > > Sure. Attached is the 0003 patch, based on Hou-San's suggestion, which
> > > refactors the syncrep parser to explicitly represent bare standby
> > > lists (i.e. those without the ANY or FIRST keyword). I've also
> > > attached the two preceding patches for completeness.
> > >
> > > --
> >
> > Thanks Ashutosh. I was testing 001 alone (assuming we would like to
> > commit it standalone).
> >
> > This part seems problematic in standlaone 001 patch:
> >
> > + /*
> > + * For synchronized_standby_slots, a comma-separated list means all
> > + * listed slots are required. The parser returns num_sync=1 in this
> > + * shape, so map it to nmembers to enforce all-mode semantics.
> > + */
> > + if (syncrep_parse_result->num_sync == 1 &&
> > + syncrep_parse_result->syncrep_method == SYNC_REP_PRIORITY &&
> > + syncrep_parse_result->nmembers > 1)
> > + syncrep_parse_result->num_sync = syncrep_parse_result->nmembers;
> >
> >
> > With this, 'FIRST 1(standby_1, standby_2)' is convered to wait_for_all
> > mode, which is not correct and it keeps wiating for standby_2 when
> > standby_1 has already taken changes. I am not sure what is correct way
> > to deal with it when it comes to first patch alone. My expectation was
> > that FIRST-syntax is blocked i.e. it errors out instead of partially
> > implemented and misbehaving. But if we plan to do so, the challenge
> > will be how to distinguish actual FIRST and comma separated list
> > implicitly converted to 'FIRST 1' by syncrep parser. For that we will
> > need either 003 or IsPrioritySyncStandbySlotsSyntax', thus defeating
> > the whole purpose of separating the patches. What do you think on
> > this? 001 is okay as is or we shall block FIRST?
> >
>
> AFAICS, we should first finalize the synchronous replication parser
> changes to give an identity to the plain list mode. Once that is
> settled, we may need to reorder the patch based on the decision we
> take. If we decide to proceed with the parser changes, my
> understanding is that they should come first, followed by the
> implementation of support for the ANY/FIRST clauses.
I agree. I do not see any other correct way to do it.
A few trivial comments on 001:
1)
+# B) ANY 1 (sb1_slot, sb2_slot) (quorum mode)
+# - Proceeds when at least N slots have caught up
+# - Skips missing/invalid/logical slots and lagging slots
(inactive or active)
+# to find N caught-up slots
We can mention 'ANY N' instead of 'ANY 1' so that the explanation
using 'N' makes more sense.
2)
+sub poll_query_until_quiet
+{
+ my ($node, $dbname, $query, $expected, $timeout_secs) = @_;
+
+ $expected = 't' unless defined($expected);
+ $timeout_secs = $PostgreSQL::Test::Utils::timeout_default
We can get rid of this function now.
thanks
Shveta
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Lakshmi N | 2026-04-15 06:58:01 | Typo fixes in Graph table files |
| Previous Message | SATYANARAYANA NARLAPURAM | 2026-04-15 06:44:23 | [PATCH] Fix WAIT FOR LSN standby_write/standby_flush for archive recovery cases |