Re: Shutdown indefinitely stuck due to unflushed FPI_FOR_HINT record

From: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Alexander Lakhin <exclusion(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Shutdown indefinitely stuck due to unflushed FPI_FOR_HINT record
Date: 2026-03-17 16:51:40
Message-ID: CAO6_XqqaaejpuoQ5y04n6tX6ZdsWLtEyvyV3fEm2A9tSaWL5qw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 17, 2026 at 12:26 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> This stuff seems sensible enough that I think we should at least have
> a test, no? It does not have to be absolutely perfect in terms of
> reproducibility, just good enough to be able to detect it across the
> buildfarm. We already do various things with page boundaries in WAL
> during recovery, and a shutdown could be perhaps timed to increase the
> reproducibility rate of the issues discussed?

I initially thought that there was no easy way to trigger this issue
reliably in a test: the script I've been using stops working as soon as
any record size changes. Then I remembered that
pg_logical_emit_message exists and can be used to write a WAL
record of a specific size, without allocating an xid and without
flushing the record.

With this, the test can be simplified to:
SELECT pg_switch_wal();
BEGIN;
SELECT pg_logical_emit_message(false, '', repeat('a', 16265), false);
ROLLBACK;

Any change to the WAL short header, long header or xl_logical_message
struct will "break" the test, since the record will no longer end
exactly at the page boundary. This also assumes 8-byte alignment. On
32-bit machines the WAL record will end at 3FF0, so not exactly at the
end of the page, but that should be fine for testing different
conditions.
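
For reference, the arithmetic behind the 16265 value can be sketched as
below. This is only my reading of the layout, not something taken from
the patch: the sizes are assumptions for a 64-bit build with 8 kB WAL
pages (40-byte long page header, 24-byte short page header, 24-byte
XLogRecord header, 5-byte long data header, 24 bytes for
xl_logical_message before the payload, and an empty prefix stored as a
single NUL byte). The function name is made up for the illustration.

```python
# Assumed sizes for a 64-bit build with default 8 kB WAL pages.
XLOG_BLCKSZ = 8192
LONG_PAGE_HEADER = 40        # first page of a segment
SHORT_PAGE_HEADER = 24       # every following page
RECORD_HEADER = 24           # XLogRecord
DATA_HEADER_LONG = 5         # main data header when the data exceeds 255 bytes
LOGICAL_MESSAGE_HEADER = 24  # xl_logical_message up to the payload
MAXALIGN = 8


def record_end_offset(message_len, prefix_len=0):
    """Offset in the segment where the next record would start, assuming
    the message record is the first record after pg_switch_wal()."""
    # The prefix is stored with its trailing NUL byte.
    remaining = (RECORD_HEADER + DATA_HEADER_LONG + LOGICAL_MESSAGE_HEADER
                 + prefix_len + 1 + message_len)
    pos = LONG_PAGE_HEADER
    while True:
        room = XLOG_BLCKSZ - pos % XLOG_BLCKSZ
        if remaining <= room:
            pos += remaining
            break
        # The record continues on the next page, after a short page header.
        remaining -= room
        pos += room + SHORT_PAGE_HEADER
    # Records start on MAXALIGN boundaries, so round the end up.
    return (pos + MAXALIGN - 1) // MAXALIGN * MAXALIGN


print(hex(record_end_offset(16265)))
```

Under those assumptions the record is 16319 bytes long and, once
aligned, ends at offset 0x4000, which is the end of the second page.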

A word of caution about this test: while running it on my machine,
I managed to trigger some weird WAL corruption. The new segment
after the switch had 1 or 2 extra bytes at the start of the
segment, just before the xlog page magic, shifting the whole file. The
first time it happened, I thought I'd messed something up and added
the bytes myself while looking at the WAL with imhex. The second time,
I only ran the script, and the new segment had a 1.1MB size shortly
after, so I'm pretty sure I didn't do anything that could have
introduced those extra bytes.

I'm still trying to understand the trigger conditions (some race
condition between the switch and the walwriter?), but if this test is
merged, it may trigger this WAL corruption issue on the buildfarm.
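
In case it helps anyone trying to spot the same symptom, here is a rough
sketch of a check on the first bytes of a segment file. It is not part
of the patch. It assumes a little-endian 64-bit build, the default 16MB
segment size and 8 kB WAL block size, and my understanding of the
XLogLongPageHeaderData layout. It deliberately does not check the magic
value, since that changes between major versions. The function name is
made up.

```python
import struct

# Assumed layout of XLogLongPageHeaderData (little-endian, 64-bit build):
# xlp_magic, xlp_info, xlp_tli, xlp_pageaddr, xlp_rem_len, 4 padding bytes,
# xlp_sysid, xlp_seg_size, xlp_xlog_blcksz.
LONG_HEADER = struct.Struct("<HHIQI4xQII")
XLP_LONG_HEADER = 0x0002  # flag set in xlp_info on the first page of a segment


def check_segment_start(buf, wal_blcksz=8192, wal_segsz=16 * 1024 * 1024):
    """Return a list of problems found in the first page header of a segment."""
    if len(buf) < LONG_HEADER.size:
        return ["file shorter than a long page header"]
    (_magic, info, _tli, pageaddr, _rem_len,
     _sysid, seg_size, blcksz) = LONG_HEADER.unpack_from(buf)
    problems = []
    if not info & XLP_LONG_HEADER:
        problems.append("long header flag not set")
    if seg_size != wal_segsz:
        problems.append("unexpected xlp_seg_size %d" % seg_size)
    if blcksz != wal_blcksz:
        problems.append("unexpected xlp_xlog_blcksz %d" % blcksz)
    if pageaddr % wal_segsz != 0:
        problems.append("xlp_pageaddr not on a segment boundary")
    return problems
```

A segment shifted by a byte or two should fail most of these checks,
since every field is then read from the wrong offset. To use it, pass
the first 40 bytes of the segment file, read in binary mode.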

Regards,
Anthonin Bonnefoy

Attachment: v1-0001-Add-test-shutting-down-walsender-with-unflushed-r.patch (application/octet-stream, 2.9 KB)
