Re: BUG: Former primary node might stuck when started as a standby

From:	Alexander Lakhin <exclusion(at)gmail(dot)com>
To:	Michael Paquier <michael(at)paquier(dot)xyz>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Aleksander Alekseev <aleksander(at)timescale(dot)com>
Subject:	Re: BUG: Former primary node might stuck when started as a standby
Date:	2026-03-04 08:00:00
Message-ID:	9d4abbe2-95aa-47e0-9ce2-842196a662a7@gmail.com
Lists:	pgsql-hackers

Hello Michael,

04.03.2026 07:31, Michael Paquier wrote:
>> I guess so. cluster::stop does the `pg_ctl stop -m fast` command. In this case
>> the walsender waits till there are nothing to be sent, see WalSndLoop().
>> Do let me know if you have observed the similar failure here.
> Exactly. Doing a clean stop of the primary offers a strong guarantee
> here. We are sure that the standby will have received all the records
> from the primary. Timeline forking is an impossible thing in
> 012_subtransactions.pl based on how the switchover from the primary to
> the standby happens. I don't see a need for tweaking this test at
> all. Or perhaps you did see a failure of some kind in this test,
> Alexander?

Right, 012_subtransactions doesn't fail with an aggressive bgwriter, as I
noted before. I mentioned it precisely to show that stop does matter here.
But if we recognize teardown_node as risky in this context, maybe it would
make sense to also review other tests in recovery/. I already wrote about
004_timeline_switch, but there are probably more. E.g., 028_pitr_timelines
(I haven't tested it intensively yet) does:
$node_primary->stop('immediate');

# Promote the standby, and switch WAL so that it archives a WAL segment
# that contains all the INSERTs, on a new timeline.
$node_standby->promote;

Best regards,
Alexander

In response to

Re: BUG: Former primary node might stuck when started as a standby at 2026-03-04 05:31:29 from Michael Paquier

Responses

Re: BUG: Former primary node might stuck when started as a standby at 2026-03-04 08:23:29 from Michael Paquier