| From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
|---|---|
| To: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
| Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Aleksander Alekseev <aleksander(at)timescale(dot)com> |
| Subject: | Re: BUG: Former primary node might stuck when started as a standby |
| Date: | 2026-02-17 06:00:00 |
| Message-ID: | 63e55743-669e-4300-a561-7b7ff63723b6@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Dear Kuroda-san,
16.02.2026 05:10, Hayato Kuroda (Fujitsu) wrote:
>> From my old records, 009_twophase.pl failed exactly due to background (
>> namely, bgwriter's) activity.
> Okay, so I think there are two reasons why the test could fail.
>
> 1) old primary shut down before all changes are replicated. This can avoid by
> adding wait_for_replay_catchup() before the tearing down.
> 2) bgwriter on old primary generated the RUNNING_XACTS record and the node shut
> dwon before sending it.
>
> .. and you mentioned for the case 2), right? I recalled that an injection point
> "skip-log-running-xacts" can be used to supress generating the WAL record, see
> 035_standby_logical_decoding.pl. My idea is to attach the injeciton point before
> the switchover and avoid adding the record.
> Attached patch implements the idea.
>
> How do you feel?
Unfortunately, the testing procedure I shared above still produces failures
with the patched 009_twophase.pl. With
wal_debug = on
in /tmp/temp.config, I can see:
...
ITERATION 3
...
11 t/009_twophase.pl .. ok
11 All tests successful.
11 Files=1, Tests=30, 10 wallclock secs ( 0.02 usr 0.00 sys + 0.27 cusr 0.63 csys = 0.92 CPU)
11 Result: PASS
20 make: *** [Makefile:28: check] Terminated
parallel: This job failed:
TEMP_CONFIG=/tmp/temp.config PROVE_TESTS=t/009* NO_TEMP_INSTALL=1 timeout 60 make check -s -C src/test/recovery_20
2026-02-17 07:06:44.313 EET client backend[754908] 009_twophase.pl LOG: statement: PREPARE TRANSACTION 'xact_009_10';
2026-02-17 07:06:44.313 EET client backend[754908] 009_twophase.pl LOG: INSERT @ 0/030227F8: - Transaction/PREPARE:
gid xact_009_10: 2026-02-17 07:06:44.313023+02; subxacts: 790
2026-02-17 07:06:44.313 EET client backend[754908] 009_twophase.pl STATEMENT: PREPARE TRANSACTION 'xact_009_10';
2026-02-17 07:06:44.313 EET client backend[754908] 009_twophase.pl LOG: xlog flush request 0/030227F8; write
0/00000000; flush 0/00000000
2026-02-17 07:06:44.313 EET client backend[754908] 009_twophase.pl STATEMENT: PREPARE TRANSACTION 'xact_009_10';
2026-02-17 07:06:44.313 EET background writer[754333] LOG: INSERT @ 0/03022838: - Standby/RUNNING_XACTS: nextXid 791
latestCompletedXid 788 oldestRunningXid 789; 1 xacts: 789; 1 subxacts: 790
2026-02-17 07:06:44.322 EET client backend[754949] 009_twophase.pl LOG: statement: SELECT pg_current_wal_flush_lsn()
2026-02-17 07:06:44.330 EET client backend[754974] 009_twophase.pl LOG: statement: SELECT '0/030227F8' <= replay_lsn
AND state = 'streaming'
FROM pg_catalog.pg_stat_replication
WHERE application_name IN ('paris', 'walreceiver')
2026-02-17 07:06:44.337 EET postmaster[754306] LOG: received immediate shutdown request
2026-02-17 07:06:44.339 EET postmaster[754306] LOG: database system is shut down
...
2026-02-17 07:06:44.566 EET startup[755326] LOG: REDO @ 0/03022618; LSN 0/03022650: prev 0/030225E0; xid 0; len 24 -
Standby/RUNNING_XACTS: nextXid 789 latestCompletedXid 788 oldestRunningXid 789
2026-02-17 07:06:44.566 EET startup[755326] LOG: REDO @ 0/03022650; LSN 0/030226A0: prev 0/03022618; xid 789; len 3;
blkref #0: rel 1663/5/16384, blk 0 - Heap/INSERT: off: 21, flags: 0x00
2026-02-17 07:06:44.566 EET startup[755326] LOG: REDO @ 0/030226A0; LSN 0/030226F0: prev 0/03022650; xid 790; len 3;
blkref #0: rel 1663/5/16384, blk 0 - Heap/INSERT: off: 22, flags: 0x00
2026-02-17 07:06:44.566 EET startup[755326] LOG: REDO @ 0/030226F0; LSN 0/030227F8: prev 0/030226A0; xid 789; len 232 -
Transaction/PREPARE: gid xact_009_10: 2026-02-17 07:06:44.313023+02; subxacts: 790
2026-02-17 07:06:44.566 EET startup[755326] LOG: REDO @ 0/030227F8; LSN 0/03022838: prev 0/030226F0; xid 0; len 32 -
Standby/RUNNING_XACTS: nextXid 791 latestCompletedXid 788 oldestRunningXid 789; 1 xacts: 789; 1 subxacts: 790
2026-02-17 07:06:44.566 EET startup[755326] LOG: invalid record length at 0/03022838: expected at least 24, got 0
2026-02-17 07:06:44.566 EET startup[755326] LOG: consistent recovery state reached at 0/03022838
2026-02-17 07:06:44.566 EET postmaster[755283] LOG: database system is ready to accept read-only connections
2026-02-17 07:06:44.569 EET walreceiver[755338] LOG: fetching timeline history file for timeline 2 from primary server
2026-02-17 07:06:44.569 EET walreceiver[755338] LOG: started streaming WAL from primary at 0/03000000 on timeline 1
2026-02-17 07:06:44.569 EET walreceiver[755338] LOG: replication terminated by primary server
2026-02-17 07:06:44.569 EET walreceiver[755338] DETAIL: End of WAL reached on timeline 1 at 0/030227F8.
2026-02-17 07:06:44.570 EET startup[755326] LOG: new timeline 2 forked off current database system timeline 1 before
current recovery point 0/03022838
...
Regarding the approach with "skip-log-running-xacts", I wonder if there is
a strong guarantee that no other WAL write can happen in the same window
(e.g., from autovacuum)?
Best regards,
Alexander
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andreas Karlsson | 2026-02-17 06:08:17 | Re: add warning upon successful md5 password auth |
| Previous Message | Fujii Masao | 2026-02-17 05:55:29 | Re: [PATCH] Expose checkpoint reason to completion log messages. |