Re: [BUG] Panic due to incorrect missingContrecPtr after promotion

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: robertmhaas(at)gmail(dot)com, simseih(at)amazon(dot)com, alvherre(at)alvh(dot)no-ip(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUG] Panic due to incorrect missingContrecPtr after promotion
Date: 2022-06-22 01:41:51
Message-ID: YrJzXwDSa3XWzOKB@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 21, 2022 at 10:35:33AM +0900, Kyotaro Horiguchi wrote:
> At Mon, 20 Jun 2022 11:57:20 -0400, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in
>> says "don't keep trying to read more WAL, just promote RIGHT HERE?". I
>> think this logic would surely be incorrect in that case. It feels to
>> me like the right thing to do is to always keep track if WAL ends in
>> an incomplete record, and then when we promote, we write an aborted
>> contrecord record if WAL ended in an incomplete record, full stop.

Agreed. The state is maintained in the WAL reader as far as I
understand the logic behind it.

>> Now, we do need to keep in mind that, while in StandbyMode, we might
>> reach the end of WAL more than once, because we have a polling loop
>> that looks for more WAL. So it does not work to just set the values
>> once and then assume that's the whole truth forever. But why not
>> handle that by storing the abortedRecPtr and missingContrecPtr
>> unconditionally after every call to XLogPrefetcherReadRecord()? They
>> will go back and forth between XLogRecPtrIsInvalid and other values
>> many times but the last value should -- I think -- be however things
>> ended up at the point where we decided to stop reading WAL.
>
> Unfortunately it doesn't work because we read a record already known
> to be complete again at the end of recovery. It is the reason of
> "abortedRecPtr < xlogreader->EndRecPtr" in my PoC patch. Without it,
> abrotedRecPtr is erased when it is actually needed. I don't like that
> expedient-looking condition, but the strict condition for resetting
> abortedRecPtr is iff "we have read a complete record at the same or
> grater LSN ofabortedRecPtr"...
>
> Come to think of this, I noticed that we can get rid of the
> file-global abortedRecPtr by letting FinishWalRecovery copy
> abortedRecPtr *before* reading the last record. This also allows us to
> remove the "expedient" condition.

Interesting point, though I am not sure to get why this is safe
compared to the existing solution of setting
missingContrecPtr/abortedRecPtr while reading a set of records when we
look for the last LSN at the end of recovery.

>> I haven't really dived into this issue much so I might be totally off
>> base, but it just doesn't feel right to me that this should depend on
>> whether we're in standby mode. That seems at best incidentally related
>> to whether an aborted contrecord record is required.
>>
>> P.S. "aborted contrecord record" is really quite an awkward turn of phrase!
>
> Thats true! Always open for better phrasings:(

The term that would be appropriate here is continuation record?
contrecord is a bit confusing for French-speakers, actually, as
"contre" means "against" ;)

By the way, something that itches me with the proposed patch is that
we don't actually stress the second problem reported, which is that
the use of StandbyMode is incorrect. Isn't that a sign that we'd
better extend more the tests of 026_overwrite_contrecord.pl with more
end-of-recovery scenarios? Two things that immediately come to mind
are the use of recovery_target_lsn that would force a promotion in the
middle of a continuation record and cascading standbys to make sure
that these get the extra OVERWRITE_CONTRECORD record generated at the
end of recovery.
--
Michael

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-06-22 01:56:26 Re: amcheck is using a wrong macro to check compressed-ness
Previous Message Michael Paquier 2022-06-22 01:04:05 Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?