| From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
|---|---|
| To: | Xuneng Zhou <xunengzhou(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
| Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
| Subject: | Re: Refactor recovery conflict signaling a little |
| Date: | 2026-03-07 11:00:01 |
| Message-ID: | 3e07149d-060b-48a0-8f94-3d5e4946ae45@gmail.com |
| Lists: | pgsql-hackers |
Hello Xuneng and Heikki,
04.03.2026 07:33, Xuneng Zhou wrote:
>> 03.03.2026 17:39, Heikki Linnakangas wrote:
>>> On 24/02/2026 10:00, Alexander Lakhin wrote:
>>>> The "terminating process ..." message doesn't appear when the test passes
>>>> successfully.
>>> Hmm, right, looks like something wrong in signaling the recovery conflict. I can't tell if the signal is being sent,
>>> or it's not processed correctly. Looking at the code, I don't see anything wrong.
>>>
> I was unable to reproduce the issue on an x86_64 Linux machine using
> the provided script. All test runs completed successfully without any
> failures.
I've added debug logging (see attached) and saw the following in a failed run:
!!!SignalRecoveryConflict[282363]
!!!ProcArrayEndTransaction| pendingRecoveryConflicts = 0
!!!ProcessInterrupts[283863]| MyProc->pendingRecoveryConflicts: 0
!!!ProcessInterrupts[283863]| MyProc->pendingRecoveryConflicts: 0
2026-03-07 12:21:24.544 EET walreceiver[282421] FATAL: could not receive data from WAL stream: server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
2026-03-07 12:21:24.645 EET postmaster[282355] LOG: received immediate shutdown request
2026-03-07 12:21:24.647 EET postmaster[282355] LOG: database system is shut down
For a successful run, by contrast, I see:
2026-03-07 12:18:17.075 EET startup[285260] DETAIL: The slot conflicted with xid horizon 677.
2026-03-07 12:18:17.075 EET startup[285260] CONTEXT: WAL redo at 0/04022130 for Heap2/PRUNE_ON_ACCESS: snapshotConflictHorizon: 677, isCatalogRel: T, nplans: 0, nredirected: 0, ndead: 2, nunused: 0, dead: [35, 36]; blkref #0: rel 1663/16384/16418, blk 10
!!!SignalRecoveryConflict[285260]
!!!ProcessInterrupts[286071]| MyProc->pendingRecoveryConflicts: 16
!!!ProcessRecoveryConflictInterrupts[286071]
!!!ProcessRecoveryConflictInterrupts[286071] pending: 16, reason: 4
2026-03-07 12:18:17.075 EET walsender[286071] 035_standby_logical_decoding.pl ERROR: canceling statement due to conflict with recovery
2026-03-07 12:18:17.075 EET walsender[286071] 035_standby_logical_decoding.pl DETAIL: User was using a logical replication slot that must be invalidated.
(Full logs for this failed run and a good run are attached.)
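As an aid to reading the debug output: in the successful run, the values "pending: 16, reason: 4" are consistent with pendingRecoveryConflicts being a bitmask in which bit number `reason` is set (1 << 4 = 16), whereas in the failed run ProcessInterrupts sees a mask of 0. The following is only a minimal sketch of that reading, not the actual PostgreSQL code. The function names lowest_pending_reason and check_log_values are hypothetical, and the one-bit-per-reason encoding is an assumption inferred from the logged numbers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a PostgreSQL function): return the index of the
 * lowest set bit in a pending-conflicts mask, or -1 if no bit is set.
 * Assumption: each conflict reason N is recorded as bit (1 << N).
 */
static int
lowest_pending_reason(uint32_t pending)
{
    for (int reason = 0; reason < 32; reason++)
    {
        if (pending & (UINT32_C(1) << reason))
            return reason;
    }
    return -1;                  /* nothing pending */
}

/* Check the helper against the two mask values seen in the debug logs. */
static void
check_log_values(void)
{
    /* Successful run: "pending: 16, reason: 4" */
    assert(lowest_pending_reason(16) == 4);

    /* Failed run: "pendingRecoveryConflicts: 0", so there is no reason to act on */
    assert(lowest_pending_reason(0) == -1);
}
```

Under that assumption, a mask of 0 at the time ProcessInterrupts runs leaves no conflict reason to process, which would match the absence of the "canceling statement" message in the failed run.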
Best regards,
Alexander
| Attachment | Content-Type | Size |
|---|---|---|
| 035_debugging.patch | text/x-patch | 2.7 KB |
| 035_logs.tar.bz2 | application/x-bzip2 | 7.7 KB |