Re: [BUG] Panic due to incorrect missingContrecPtr after promotion

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: simseih(at)amazon(dot)com
Cc: alvherre(at)alvh(dot)no-ip(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUG] Panic due to incorrect missingContrecPtr after promotion
Date: 2022-05-27 06:37:35
Message-ID: 20220527.153735.871115454014517938.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Fri, 27 May 2022 02:01:27 +0000, "Imseih (AWS), Sami" <simseih(at)amazon(dot)com> wrote in
> After further research, we found the following.
>
> Testing on 13.6 with the attached patch we see
> that the missingContrecPtr is being incorrectly
> set on the standby and the promote in the tap
> test fails.
>
> Per the comments in xlog.c, the
> missingContrecPtr should not be set when
> in standby mode.
>
> /*
> * When not in standby mode we find that WAL ends in an incomplete
> * record, keep track of that record. After recovery is done,
> * we'll write a record to indicate downstream WAL readers that
> * that portion is to be ignored.
> */
> if (!StandbyMode &&
> !XLogRecPtrIsInvalid(xlogreader->abortedRecPtr))
> {
> abortedRecPtr = xlogreader->abortedRecPtr;
> missingContrecPtr = xlogreader->missingContrecPtr;
> elog(LOG, "missingContrecPtr == %ld", missingContrecPtr);
> }
>
> If StandbyModeRequested is checked instead, which
> checks for the presence of a standby signal file,
> The missingContrecPtr is not set on the
> standby and the test succeeds.
>
> if (!StandbyModeRequested &&
> !XLogRecPtrIsInvalid(xlogreader->abortedRecPtr))
> {
> abortedRecPtr = xlogreader->abortedRecPtr;
> missingContrecPtr = xlogreader->missingContrecPtr;
> elog(LOG, "missingContrecPtr == %ld", missingContrecPtr);
> }
>
> If this is a bug as it appears, it appears the original patch
> to resolve this issue is not needed and the ideal fix
> Is to ensure that a standby does not set
> missingContrecPtr.
>
> Would like to see what others think.

Due to lack of information, I don't have a clear idea of what is
happening here. If that change "fixed" the "issue", that seems to mean
that crash recovery passed an immature contrecord (@6/A8000060) but
recovery did not stop at the time then continues until 10/B1FA3D88. A
possibility is, crash recovery ended at the immature contrecord then
the server moved to archive recovery mode and was able to continue
from the same (aborted) LSN on the archived WAL. This means 6/A8 in
pg_wal had been somehow removed after archived, or the segment 6/A7
(not A8) in both pg_xlog and archive are in different histories, which
seems to me what we wanted to prevent by the
XLOG_OVERWRITE_CONTRECORD. Didn't you use the same archive content
for repeated recovery testing?

And from the fact that entering crash recovery means the cluster
didn't have an idea of how far it should recover until consistency,
that is contained in backup label control file. That could happen
when a crashed primary is as-is reused as the new standby of the new
primary.

So.. I'd like to hear exactly what you did as the testing.

When standby mode is requested, if crash recovery fails with immature
contrecord, I think we shouldn't continue recovery. But I'm not sure
we need to explictly reject that case. Further study is needed..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tharakan, Robins 2022-05-27 07:03:52 Allow makeaclitem() to accept multiple privileges
Previous Message Yugo NAGATA 2022-05-27 06:30:28 Prevent writes on large objects in read-only transactions