
RE: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication

From:	"Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To:	Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc:	Japin Li <japinli(at)hotmail(dot)com>, surya poondla <suryapoondla4(at)gmail(dot)com>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	RE: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication
Date:	2026-04-08 02:09:18
Message-ID:	TYRPR01MB14195AAC05C7D97F1EE5B9FDF945BA@TYRPR01MB14195.jpnprd01.prod.outlook.com
Lists:	pgsql-hackers

On Tuesday, April 7, 2026 9:54 PM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
>
> On Tue, Apr 7, 2026 at 5:18 PM shveta malik <shveta(dot)malik(at)gmail(dot)com>
> wrote:
> >
> > On Tue, Apr 7, 2026 at 3:56 PM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
> > >
> > > Hi,
> > >
> > > On Tue, Apr 7, 2026 at 11:20 AM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Tue, Apr 7, 2026 at 9:04 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > > > >
> > > > >
> > > > > I see your point. I agree that using
> > > > > wal_receiver_status_interval for this test may not be a reliable
> > > > > way. Can we attempt using
> > > > > pg_wal_replay_pause() on standby and then checking
> > > > > wait_event=WaitForStandbyConfirmation with
> > > > > backend_type=walsender on primary? Or do you see any issues in
> > > > > this approach that I might be overlooking?
> > > > >
> > > >
> > > > Yes, I think we can make use of the WAL replay pause/resume
> > > > mechanism. This seems like the right approach, as it gives us a
> > > > more controlled and deterministic way to validate the lagging
> > > > behavior.
> > > >
> > >
> > > Looking at 049_wait_for_lsn.pl (the test case you referenced), it
> > > explicitly stops the WAL receiver by setting primary_conninfo to an
> > > empty string, rather than just pausing WAL replay.
> >
> > Oh, I missed it in that testcase. Setting primary_conninfo to NULL
> > essentially means the walreceiver is not started, which makes the
> > standby slot inactive, and we already have a testcase for that.
> >
> > > Using
> > > pg_wal_replay_pause() alone only halts replay; the WAL receiver
> > > continues running, keeps receiving WAL, and sends feedback/status to
> > > the primary. That feedback is sufficient to advance restart_lsn on
> > > the standby’s slot, so the restart_lsn < wait_for_lsn condition
> > > inside StandbySlotsHaveCaughtup() would no longer hold, which is
> > > not what we want.
> >
> > Yes, I see. IIUC, the same problem will be there if we use
> > recovery_min_apply_delay, i.e., WAL will be received and flushed, and
> > feedback will be sent to the primary; only replay will be delayed. We
> > can use 'synchronous_commit = remote_apply' along with
> > 'recovery_min_apply_delay', but that would delay logical replication
> > because the transaction commit is blocked, not because the standby
> > is actually lagging. It would not be a suitable test for
> > 'synchronized_standby_slots'.
> >
>
> Even with synchronous_commit = remote_apply and paused replay, the
> standby can still send replies to the primary that update the slot's
> restart_lsn.

If we only want to keep the slot active without advancing restart_lsn, we could
start a replication connection and then acquire the slot with the help of
the replication command: START_REPLICATION SLOT physical 0/01788488;

E.g.,

$standby->psql(
    'postgres',
    qq[START_REPLICATION SLOT physical 0/01788488;],
    replication => 'database');
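
To make the condition under discussion concrete, here is a rough Python model of the restart_lsn < wait_for_lsn comparison mentioned above for StandbySlotsHaveCaughtup(). It is only a sketch, not the server code: the helper names parse_lsn and is_lagging are hypothetical, and the second LSN in the usage note is made up for illustration. It relies only on the textual LSN format, two hexadecimal numbers separated by a slash, where the first is the high 32 bits and the second the low 32 bits.

```python
def parse_lsn(text: str) -> int:
    """Convert a textual LSN such as '0/01788488' to a 64-bit integer.

    The part before the slash is the high 32 bits and the part after it
    is the low 32 bits, both written in hexadecimal.
    """
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def is_lagging(restart_lsn: str, wait_for_lsn: str) -> bool:
    """Hypothetical model of the check discussed in this thread.

    Returns True while the slot's restart_lsn is still behind the LSN
    being waited for, i.e. the state the test wants to hold the slot in.
    """
    return parse_lsn(restart_lsn) < parse_lsn(wait_for_lsn)
```

For example, is_lagging("0/01788488", "0/01788500") is true, and it becomes false once standby feedback moves restart_lsn up to "0/01788500". That is why pausing replay alone does not keep the slot behind, whereas a connection that holds the slot without sending feedback should.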

Best Regards,
Hou zj

