Re: VM corruption on standby

From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Melanie Plageman <melanieplageman(at)gmail(dot)com>
Subject: Re: VM corruption on standby
Date: 2025-08-13 11:15:03
Message-ID: CALdSSPiA-Jv0vXGrgG8+zd3s3Cg9azatSPR6pazUixcewAf-wg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 12 Aug 2025 at 13:00, I wrote:

> While this aims to find existing VM corruption (i mean, in PG <= 17),
> this reproducer does not seem to work on pg17. At least, I did not
> manage to reproduce this scenario on pg17.
>
> This makes me think this exact corruption may be pg18-only. Is it
> possible that AIO is somehow involved here?

First of all, the "corruption" is reproducible with io_method = sync,
so AIO is not under suspicion.
Then, I did a gdb session many times and I ended up with the
conclusion that this test is NOT a valid corruption reproducer.
So, the thing is, when you involve injection point logic, due how inj
points are implemented, you allow postgres to enter WaitLatch
function, which
has its own logic for postmaster death handling[0].

So, when we add injection point here, we allow this sequence of events
to happen:

1) INSERT enters `heap_insert`, modifies HEAP page, goes to inj point and hangs.
2) CHECKPOINT process tries to FLUSH this page and wiats
3) kill -9 to postmaster
4) INSERT wakes up on postmaster death, goes to [0] and releases all locks.
5) CHECKPOINT-er flushes the HEAP page to disk, causing corruption.

The thing is, this execution will NOT happen without inj points.

So, overall, injection points are not suitable for this critical
section testing (i think).

==
Off-list Andrey send me this patch:

```
diff --git a/src/backend/storage/ipc/waiteventset.c
b/src/backend/storage/ipc/waiteventset.c
index 7c0e66900f9..e89e1d115cb 100644
--- a/src/backend/storage/ipc/waiteventset.c
+++ b/src/backend/storage/ipc/waiteventset.c
@@ -1044,6 +1044,7 @@ WaitEventSetWait(WaitEventSet *set, long timeout,
long cur_timeout = -1;

Assert(nevents > 0);
+ Assert(CritSectionCount == 0);

/*
* Initialize timeout if requested. We must record the current time so
```

The objective is to confirm our assumption that the WaitEventSetWait
call ought not to occur during critical sections. This patch causes
`make check` to fail, indicating that this assumption is incorrect.
Assertion breaks due to AdvanceXLInsertBuffer call (which uses
condvar logic) inside XlogInsertRecord.

I did not find any doc or other piece of information indicating
whether WaitEventSetWait and critical sections are allowed. But I do
thing this is bad, because we do not process interruptions during
critical sections, so it is unclear to me why we should handle
postmaster death any differently.

[0] https://github.com/postgres/postgres/blob/393e0d2314050576c9c039853fdabe7f685a4f47/src/backend/storage/ipc/waiteventset.c#L1260-L1261

--
Best regards,
Kirill Reshke

In response to

Browse pgsql-hackers by date

  From Date Subject
Previous Message vignesh C 2025-08-13 10:56:58 Re: Logical Replication of sequences