Re: 17.8 standby crashes during WAL replay from 17.5 primary: "could not access status of transaction"

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Sebastian Webber <sebastian(at)swebber(dot)me>, pgsql-bugs(at)lists(dot)postgresql(dot)org, Andrey Borodin <amborodin(at)acm(dot)org>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Dmitry Yurichev <dsy(dot)075(at)yandex(dot)ru>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Ivan Bykov <i(dot)bykov(at)modernsys(dot)ru>, Kirill Reshke <reshkekirill(at)gmail(dot)com>
Subject: Re: 17.8 standby crashes during WAL replay from 17.5 primary: "could not access status of transaction"
Date: 2026-02-14 16:18:39
Message-ID: FC778C81-C310-4F9B-99EC-920EAF853F8C@yandex-team.ru
Lists: pgsql-bugs

Ouch...

I remember this place. For some reason I thought endTruncOff is the end of offsets. That would make sense here... Now I see it's just a new oldest offset.

> On 14 Feb 2026, at 16:42, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>
> If we want to play it even more safe -- and I guess that's the right thing to do for backpatching -- we could set latest_page_number *temporarily* while we do the truncation, and restore the old value afterwards.

As far as I can see, the only relevant usage of latest_page_number is:
    /*
     * While we are holding the lock, make an important safety check: the
     * current endpoint page must not be eligible for removal.
     */
    if (ctl->PagePrecedes(shared->latest_page_number, cutoffPage))
    {
        LWLockRelease(shared->ControlLock);
        ereport(LOG,
                (errmsg("could not truncate directory \"%s\": apparent wraparound",
                        ctl->Dir)));
        return;
    }

Perhaps we could also bump latest_page_number forward?
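
To make the two options easier to compare, here is a minimal toy sketch in C. It is not PostgreSQL code: ToySlru, toy_page_precedes, toy_truncate and toy_truncate_with_temp_bump are hypothetical names, the comparison is a plain "less than" that ignores wraparound, and using the cutoff page as the temporary value is my assumption, since the thread does not say which value to set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the SLRU shared state; not the PostgreSQL struct. */
typedef struct ToySlru
{
    int64_t latest_page_number;
} ToySlru;

/* Simplified stand-in for ctl->PagePrecedes: plain ordering, no wraparound. */
bool
toy_page_precedes(int64_t a, int64_t b)
{
    return a < b;
}

/*
 * Models the safety check quoted above: refuse to truncate when the
 * current endpoint page would itself be eligible for removal.
 */
bool
toy_truncate(const ToySlru *slru, int64_t cutoff_page)
{
    assert(slru != NULL);
    if (toy_page_precedes(slru->latest_page_number, cutoff_page))
        return false;       /* the "apparent wraparound" path: nothing removed */
    return true;            /* a real implementation would remove segments here */
}

/*
 * Assumed shape of the "temporary" variant: advance latest_page_number to
 * the cutoff only for the duration of the truncation, then restore it.
 */
bool
toy_truncate_with_temp_bump(ToySlru *slru, int64_t cutoff_page)
{
    int64_t saved;
    bool    done;

    assert(slru != NULL);
    saved = slru->latest_page_number;
    if (toy_page_precedes(saved, cutoff_page))
        slru->latest_page_number = cutoff_page;
    done = toy_truncate(slru, cutoff_page);
    slru->latest_page_number = saved;   /* restore the old value afterwards */
    return done;
}
```

With latest_page_number at 10 and a cutoff of 20, toy_truncate refuses, while toy_truncate_with_temp_bump goes through and leaves latest_page_number at 10 afterwards. Bumping forward permanently would simply skip the restore step.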

Best regards, Andrey Borodin.
