Re: Incorrect logic in XLogNeedsFlush()

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Incorrect logic in XLogNeedsFlush()
Date: 2025-10-09 16:05:08
Message-ID: CAAKRu_YrDdUfJms6yzVZV+vaQ3yqgSCOWDcMSSDr6_+TG5wCqQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 25, 2025 at 3:53 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Wed, Sep 24, 2025 at 05:33:08PM -0400, Melanie Plageman wrote:
>
> > And, if I'm not mistaken, there was another idea mentioned in [1]
> > which was to move to using GetRecoveryState() in XLogNeedsFlush() and
> > UpdateMinRecoveryPoint instead of relying on the variable InRecovery +
> > XLogRecPtrIsInvalid(LocalMinRecoveryPoint) to distinguish crash
> > recovery.
>
> The removal of InRecovery means that we would disable a bit more
> aggressively updateMinRecoveryPoint for other processes than the
> startup process as InRecovery is only set in the startup process,
> which should be OK. Some comments of XLogNeedsFlush(), like the
> comment block on top of the new GetRecoveryState() call, would need a
> refresh.

One thing I still don't understand is how catching up to
min_recovery_point works when it is not in pg_wal. When the standby
has to stream from the primary or fetch records from the archive to
reach min_recovery point -- like if the value comes from a backup
label -- then it seems like it is wrong to updateMinRecoveryPoint to
true in SwitchIntoArchiveRecovery().

But ReadRecord() says "we have now replayed all the valid WAL in
pg_wal, so we are presumably now consistent"
and then it calls SwitchIntoArchiveRecovery(), which sets
updateMinRecoveryPoint to true.

> XLogNeedsFlush() has a bit further down your change a path where
> updateMinRecoveryPoint is set to false for other processes than the
> startup process. A shortcut based on GetRecoveryState() should make
> its removal possible, simplifying a bit more the code.

Going back to my earlier question about why processes other than the
startup process are okay to read the ControlFile->minRecoveryPoint
during crash recovery, Dilip said (and you later confirmed) WRT
bgwriter:

> Until the startup process hasn't completed the crash recovery i.g. not
> applied the WAL unto 'ControlFile->minRecoveryPoint' logically
> 'bgwriter' can not write any buffer whose WAL location is beyond the
> ControlFile->minRecoveryPoint because we haven't yet applied those
> WALs and also there should not any RestartPoint()/Checkpoint record
> before reaching the ControlFile->minRecoveryPoint otherwise we would
> have started crash recovery from that point.
>
> During the crash recovery process, before the WAL has been fully
> applied up to the ControlFile->minRecoveryPoint, the bgwriter cannot
> write any buffers that have a page LSN beyond this min recovery point.
> This is because those WALs which are beyond min recovery point have
> not yet been applied, so any buffer can not have page_lsn beyond this
> point.

But then why do we bother with doing the below in
UpdateMinRecoveryPoint() and XLogNeedsFlush():

/*
* Check minRecoveryPoint for any other process than the startup
* process doing crash recovery, which should not update the control
* file value if crash recovery is still running.
*/
if (XLogRecPtrIsInvalid(LocalMinRecoveryPoint))
updateMinRecoveryPoint = false;

after reading from the ControlFile. All processes read the ControlFile
on startup, so bgwriter would have something valid (if irrelevant) in
LocalMinRecoveryPoint after reading the control file above. So, when
would this code actually execute?

For it to be okay to replace this:

if (XLogRecPtrIsInvalid(LocalMinRecoveryPoint) && InRecovery)
{
updateMinRecoveryPoint = false;
return false;
}
LocalMinRecoveryPoint = ControlFile->minRecoveryPoint;
LocalMinRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
if (XLogRecPtrIsInvalid(LocalMinRecoveryPoint))
updateMinRecoveryPoint = false;

with

if (GetRecoveryState() == RECOVERY_STATE_CRASH)
{
updateMinRecoveryPoint = false;
return false;
}
LocalMinRecoveryPoint = ControlFile->minRecoveryPoint;
LocalMinRecoveryPointTLI = ControlFile->minRecoveryPointTLI;

It seems like the second check of LocalMinRecoveryPoint after reading
from the control file would never have been true.

- Melanie

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jacob Champion 2025-10-09 16:07:53 Re: Support getrandom() for pg_strong_random() source
Previous Message Tom Lane 2025-10-09 16:04:42 Re: Adding some error context for lock wait failures