RE: BUG: Former primary node might stuck when started as a standby

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Alexander Lakhin' <exclusion(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Aleksander Alekseev <aleksander(at)timescale(dot)com>
Subject: RE: BUG: Former primary node might stuck when started as a standby
Date: 2026-02-16 03:10:46
Message-ID: OS9PR01MB121498EFA4CBF3003B83C9BCCF56CA@OS9PR01MB12149.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear Alexander,

> From my old records, 009_twophase.pl failed exactly due to background (
> namely, bgwriter's) activity.

Okay, so I think there are two reasons why the test could fail.

1) The old primary shut down before all changes were replicated. This can be
avoided by adding wait_for_replay_catchup() before the teardown.
2) The bgwriter on the old primary generated a RUNNING_XACTS record, and the node
shut down before sending it.

.. and you were referring to case 2), right? I recalled that the injection point
"skip-log-running-xacts" can be used to suppress generation of that WAL record; see
035_standby_logical_decoding.pl. My idea is to attach the injection point before
the switchover so that the record is not generated.
The attached patch implements the idea.

What do you think?

Best regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
v1-0001-Stabilize-009_twophase.pl.patch application/octet-stream 4.7 KB
