| From: | Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com> |
|---|---|
| To: | Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Cc: | Ajin Cherian <itsajin(at)gmail(dot)com>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | RE: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication |
| Date: | 2026-03-19 06:38:04 |
| Message-ID: | TY4PR01MB16907E4E69204C3B318866AF8944FA@TY4PR01MB16907.jpnprd01.prod.outlook.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Wednesday, March 18, 2026 6:38 PM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
> PFA patch that addresses above comments.
Thanks for the patch! Overall, I think this is a valuable feature as I've heard
requests from many customers for a way to avoid blocking logical replication
when only some instances are down when using synchronized_standby_slots.
I didn't find any bugs in the patch, but I have a few comments on the code and
tests:
1.
> /*
> * Return true if value starts with explicit FIRST syntax:
> *
> * FIRST N (...)
> *
> * This is used to distinguish explicit FIRST from simple list syntax whose
> * first slot name may start with "first".
> */
> static bool
> IsPrioritySyncStandbySlotsSyntax(const char *value)
I think adding a new function to manually parse the list isn't the most elegant
approach. Instead, it would be cleaner to have a new flag that distinguishes
when a plain name list is specified, and use that to mark the case
appropriately like:
/* syncrep_method of SyncRepConfigData */
#define SYNC_REP_PRIORITY 0
#define SYNC_REP_QUORUM 1
+#define SYNC_REP_IMPLICIT 2
standby_config:
- standby_list { $$ = create_syncrep_config("1", $1, SYNC_REP_PRIORITY); }
+ standby_list { $$ = create_syncrep_config("1", $1, SYNC_REP_IMPLICIT); }
2.
+ /*
+ * Allocate array to track slot states. Size it to the total number of
+ * configured slots since in the worst case all could have problem states.
+ */
+ slot_states = (SyncStandbySlotsStateInfo *)
+ palloc(sizeof(SyncStandbySlotsStateInfo) * synchronized_standby_slots_config->nslotnames);
I personally prefer building the list incrementally with List * and lappend
rather than pre-allocating, since the list may often be empty in success cases.
3.
+<synopsis>
+[FIRST] <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="parameter">slot_name</replaceable> [, ...] )
+ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="parameter">slot_name</replaceable> [, ...] )
+<replaceable class="parameter">slot_name</replaceable> [, ...]
+</synopsis>
The documentation mentions that the FIRST keyword is optional, but the code
and comments don't seem consistent. Some comments give the impression that
FIRST is required:
+ * FIRST N (slot1, slot2, ...) -- wait for first N in priority order
Additionally, IsPrioritySyncStandbySlotsSyntax() only checks for the FIRST
keyword and doesn't handle the "N (slot1, slot2)" case where FIRST is omitted.
4.
Regarding the new tests, I think we can avoid testing the plain slot list
since that's already covered extensively in 040_standby_failover_slots_sync.pl.
Instead, we could focus on testing edge cases for ANY and FIRST. I noticed
there are no tests for scenarios where logical decoding blocks when using ANY or
FIRST. For example, testing FIRST 1 (s1, s2) where s1 is alive but hasn't caught
up would be valuable (if possible to test in tap-test, maybe test via
recovery_min_apply_delay ?). This logic is more error-prone and required more
careful review than other cases.
BTW, Part E seems unnecessary since the patch reuses the sync_standby_names
parser. ISTM, testing the first_ prefix in this context doesn't add valuable
coverage.
Best Regards,
Hou zj
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Ashutosh Bapat | 2026-03-19 06:53:51 | Re: SQL Property Graph Queries (SQL/PGQ) |
| Previous Message | Postgress Cybrosys | 2026-03-19 05:57:31 | Fix type of 'reduction' variable in _bt_singleval_fillfactor() |