Re: VM corruption on standby

From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Kirill Reshke <reshkekirill(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)tigerdata(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Melanie Plageman <melanieplageman(at)gmail(dot)com>
Subject: Re: VM corruption on standby
Date: 2025-08-19 13:17:46
Message-ID: 2e220a95-5646-4ac1-ae13-762f960ea3f7@postgrespro.ru
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

19.08.2025 16:09, Andres Freund пишет:
> Hi,
>
> On 2025-08-19 15:56:05 +0300, Yura Sokolov wrote:
>> 09.08.2025 22:54, Kirill Reshke пишет:
>>> On Thu, 7 Aug 2025 at 21:36, Aleksander Alekseev
>>> <aleksander(at)tigerdata(dot)com> wrote:
>>>
>>>> Perhaps there was a good
>>>> reason to update the VM *before* creating WAL records I'm unaware of.
>>>
>>> Looks like 503c730 intentionally does it this way; however, I have not
>>> yet fully understood the reasoning behind it.
>>
>> I repeat: there was no intention. Neither in commit message, nor in
>> discussion about.
>>
>> There was intention to move visibilitymap_clear under heap page lock and
>> into critical section, but there were no any word about logging.
>>
>> I believe, it was just an unfortunate oversight that the change is made
>> before logging.
>
> The normal pattern *is* to modify the buffer while holding an exclusive lock,
> in a critical section, before WAL logging. Look at
> src/backend/access/transam/README:
>
>> The general schema for executing a WAL-logged action is
>>
>> 1. Pin and exclusive-lock the shared buffer(s) containing the data page(s)
>> to be modified.
>>
>> 2. START_CRIT_SECTION() (Any error during the next three steps must cause a
>> PANIC because the shared buffers will contain unlogged changes, which we
>> have to ensure don't get to disk. Obviously, you should check conditions
>> such as whether there's enough free space on the page before you start the
>> critical section.)
>>
>> 3. Apply the required changes to the shared buffer(s).
>>
>> 4. Mark the shared buffer(s) as dirty with MarkBufferDirty(). (This must
>> happen before the WAL record is inserted; see notes in SyncOneBuffer().)
>> Note that marking a buffer dirty with MarkBufferDirty() should only
>> happen iff you write a WAL record; see Writing Hints below.
>>
>> 5. If the relation requires WAL-logging, build a WAL record using
>> XLogBeginInsert and XLogRegister* functions, and insert it. (See
>> "Constructing a WAL record" below). Then update the page's LSN using the
>> returned XLOG location. For instance,
>>
>> XLogBeginInsert();
>> XLogRegisterBuffer(...)
>> XLogRegisterData(...)
>> recptr = XLogInsert(rmgr_id, info);
>>
>> PageSetLSN(dp, recptr);
>>
>> 6. END_CRIT_SECTION()
>>
>> 7. Unlock and unpin the buffer(s).

There is quite important step in this instruction:

> Then update the page's LSN using the returned XLOG location.

This step is violated for the call of visibilitymap_clear.

Though, probably I'm mistaken this is source of the bug. But it is really
source of other kinds of issues.

--
regards
Yura Sokolov aka funny-falcon

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Yura Sokolov 2025-08-19 13:29:34 Re: VM corruption on standby
Previous Message Kirill Reshke 2025-08-19 13:17:44 Re: VM corruption on standby