Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Ajin Cherian <itsajin(at)gmail(dot)com>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication
Date: 2026-03-18 10:38:27
Message-ID: CAE9k0PnfF27vzVnHoCxTZwbBH4yR1OMCw2MDdWc8k+bGd5sR0g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thanks again for reviewing the test cases. Please find my comments inline below:

On Wed, Mar 18, 2026 at 12:01 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
>
> Few comments:
>
> 1)
> +
> +# Physical replication slots for the two standbys
> +$primary->safe_psql('postgres',
> + "SELECT pg_create_physical_replication_slot('sb1_slot');");
> +$primary->safe_psql('postgres',
> + "SELECT pg_create_physical_replication_slot('sb2_slot');");
> +$primary->safe_psql('postgres',
> + "SELECT pg_create_physical_replication_slot('first_slot');");
>
> Can we please write a comment atop 'first_slot' creation for its purpose.
> The comment '+# Physical replication slots for the two standbys'
> confuses as we are creating 3 slots following that comment.
>

Moved this to PART E (the test-case where this slot is being used).
This is also suggested in one of your comments below.

> 2)
> +# PART A: Plain list (ALL mode) blocks when any slot is unavailable
>
> +# PART C: Verify plain list (ALL mode) requires ALL slots
>
> I think we can merge the 2 tests. Is there a specific reason we have
> it in 2 different sections (am I missing something?).
>
> Part A can first test blocking on inactive standby (which it is doing
> already) and then restart standby and can check that logical decoding
> now proceeds. Part C can then be removed.
>

I did not remove PART C entirely, however, I removed the portions that
overlapped with PART A.

I would suggest not just evaluating each test case independently, it
would be useful to also consider the overall flow to understand why
PART B follows PART A, and so on.

Viewed sequentially, PART A tests a unique scenario while also setting
up the environment needed for PART B (quorum mode). Similarly, PART C
covers a different scenario and prepares the setup for the test case
in PART D. This approach helps avoid unnecessary stop/start cycles of
the synchronous standby.

> 3)
> Same with these 2:
> +# PART B: ANY mode (quorum) — logical decoding proceeds with N-of-M slots
>
> +# PART D: ANY mode — verify it works with either standby
>
> Part B tests that it works when standby1 is down.
> Part D tests that it works when standby2 is down and also works when
> both are up.
>
> I think we don't need these duplicate test cases (unless I am missing
> something?), we can merge Part D to Part B itself i.e. test with
> standby1 down and up in a single test.
>

Removed PART D. It was almost similar to PART B.

> 4)
> These 2 tests are basically testing the same thing. Do we need both?
> +# FIRST 1 should skip sb1_slot (unavailable) and use sb2_slot.
>
> +# Test that FIRST 1 is different from plain list — FIRST 1 succeeds
> with one down.
>

Yes, they are actually the same. Hence removed the latter one.

>
> 5)
> I see that we only need 'first_slot' for the last test and do not have
> even a standby associated with it. So we can even move it's creation
> to that test itself instead of creating it in the beginning and adding
> comments to explain 'why'.
>

This is related to comment 1 above.

PFA patch that addresses above comments.

--
With Regards,
Ashutosh Sharma.

Attachment Content-Type Size
v20260318-0001-Add-FIRST-N-and-ANY-N-syntax-support-to-synchronized.patch application/octet-stream 43.0 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Kuzmenkov 2026-03-18 10:42:53 Re: Fix uninitialized xl_running_xacts padding
Previous Message lakshmi 2026-03-18 10:37:01 Re: parallel data loading for pgbench -i