Re: Incorrect logic in XLogNeedsFlush()

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Incorrect logic in XLogNeedsFlush()
Date: 2025-09-04 03:38:33
Message-ID: CAFiTN-tY8YbES0E9+xdm4XqsP=K5G5H8m39er5CSm0gyNGEJUw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Aug 30, 2025 at 4:16 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> Hi,
>
> If you call XLogNeedsFlush() immediately after calling XLogFlush() in
> FlushBuffer(), it can return true.
>
> With this diff:
>
> diff --git a/src/backend/storage/buffer/bufmgr.c
> b/src/backend/storage/buffer/bufmgr.c
> index 350cc0402aa..91c3fe99d6e 100644
> --- a/src/backend/storage/buffer/bufmgr.c
> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -4342,7 +4342,11 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln,
> IOObject io_object,
> * skip the flush if the buffer isn't permanent.
> */
> if (buf_state & BM_PERMANENT)
> + {
> XLogFlush(recptr);
> + if(XLogNeedsFlush(recptr))
> + elog(ERROR, "should be flushed and isn't");
> + }
>
> recovery/015_promotion_pages errors out.
>
> During an end-of-recovery checkpoint, XLogNeedsFlush() should compare
> the provided LSN to the flush pointer and not the min recovery point.

That right, "end-of-recovery checkpoint" actually performs a flush
instead of updating the minRecoveryPoint so we should compare it with
flush pointer.

> This only errors out during an end-of-recovery checkpoint after a
> crash when the ControlFile->minRecoveryPoint has not been correctly
> updated. In this case, LocalMinRecoveryPoint is initialized to an
> incorrect value potentially causing XLogNeedsFlush() to incorrectly
> return true.
>
> This trivial diff makes the test pass:
>
> @@ -3115,7 +3125,7 @@ XLogNeedsFlush(XLogRecPtr record)
> * instead. So "needs flush" is taken to mean whether minRecoveryPoint
> * would need to be updated.
> */
> - if (RecoveryInProgress())
> + if (RecoveryInProgress() && !XLogInsertAllowed())
> {
>
> However, since all of this code is new to me, there are some remaining
> things I don't understand:
>
> Why is it okay for other processes than the startup process to
> initialize LocalMinRecoveryPoint from ControlFile->minRecoveryPoint
> during crash recovery?

During crash recovery we don't yet know the minRecoveryPoint until we
replay all the WALs and once it is done it will be updated in the
ControlFile at PerformRecoveryXLogAction()->CreateEndOfRecoveryRecord().
After this point it will be valid to use by any process including the
StartupProcess. So I don't think until ControlFile->minRecoveryPoint
is initialized it is safe to use for any process. In fact, before any
connections are allowed this has to be set either by
CreateEndOfRecoveryRecord() if recovering a primary from crash or by
ReachedEndOfBackup() if you are starting a standby from backup.

> Besides this question, I find the use of the global variable
> InRecovery as a proxy for RecoveryInProgress() + "I'm the startup
> process" to be confusing. It seems like UpdateMinRecoveryPoint() and
> XLogNeedsFlush() could just test MyBackendType == B_STARTUP and make
> it explicit.

IIUC InRecovery is slightly different than "RecoveryInProgress() +
"I'm the startup", in StartupXlog, we first set InRecovery to false
and then update the minRecoveryPoint in CreateEndOfRecoveryRecord()
and after that we clear RecoveryInProgress (i.e. set
XLogCtl->SharedRecoveryState = RECOVERY_STATE_DONE;).

--
Regards,
Dilip Kumar
Google

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2025-09-04 03:57:56 Re: Remove traces of long in dynahash.c
Previous Message Michael Paquier 2025-09-04 03:15:24 Re: Sequence Access Methods, round two