From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | simseih(at)amazon(dot)com |
Cc: | michael(at)paquier(dot)xyz, alvherre(at)alvh(dot)no-ip(dot)org, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [BUG] Panic due to incorrect missingContrecPtr after promotion |
Date: | 2022-08-08 04:06:54 |
Message-ID: | 20220808.130654.541433441863454305.horikyota.ntt@gmail.com |
Lists: | pgsql-hackers |
At Fri, 5 Aug 2022 21:28:16 +0000, "Imseih (AWS), Sami" <simseih(at)amazon(dot)com> wrote in
> > Would you mind trying the second attached patch to obtain a detailed log on
> > your testing environment? With the patch, the modified TAP test yields
> > the log lines like below.
>
> I applied the logging patch to 13.7 (the backport is attached) and reproduced the
> issue.
>
> I stripped out the relevant parts of the file. Let me know if this is
> helpful.
Thank you very much!
> postgresql.log.2022-08-05-17:2022-08-05 17:18:51 UTC::@:[359]:LOG: ### [F] @0/10000000: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=0, SbyModeReq=1
> postgresql.log.2022-08-05-17:2022-08-05 17:22:21 UTC::@:[359]:LOG: ### [S] @0/10000060: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=1, SbyModeReq=1
The server seems to have started as a standby after a primary
crash. Is that correct?
> postgresql.log.2022-08-05-18:2022-08-05 18:38:14 UTC::@:[359]:LOG: ### [F] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=1, SbyModeReq=1
> postgresql.log.2022-08-05-18:2022-08-05 18:38:14 UTC::@:[359]:LOG: ### [S] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=0, SbyModeReq=1
Archive recovery ended here. The server should have been promoted at
that time. Do you see any interesting log lines around this time?
> postgresql.log.2022-08-05-18:2022-08-05 18:50:13 UTC::@:[359]:LOG: ### [S] @6/B8000198: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=0, SbyModeReq=1
But recovery continues in non-standby mode. I don't see why it
behaves that way.
> postgresql.log.2022-08-05-18:2022-08-05 18:50:20 UTC::@:[359]:LOG: ### [A] @6/F3FFFF20: abort=(6/F3FFFF20)0/0, miss=(6/F4000000)0/0, SbyMode=0, SbyModeReq=1
> postgresql.log.2022-08-05-18:2022-08-05 18:50:20 UTC::@:[359]:LOG: ### [S] @6/F4000030: abort=(0/0)6/F3FFFF20, miss=(0/0)6/F4000000, SbyMode=1, SbyModeReq=1
Then archive recovery starts again.
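For longer logs, the SbyMode flips walked through above could be picked out mechanically. The following is only a minimal sketch, assuming the debug lines keep exactly the format quoted above; the function names (parse_lsn, parse_line, standby_mode_changes) are hypothetical helpers and not part of the logging patch, and the abort=/miss= fields are deliberately left uninterpreted.

```python
import re

# Matches the debug lines quoted in this mail, for example:
#   ### [S] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=0, SbyModeReq=1
# Only the tag, the LSN after "@", SbyMode and SbyModeReq are extracted.
LINE_RE = re.compile(
    r"### \[(?P<tag>\w)\] @(?P<lsn>[0-9A-Fa-f]+/[0-9A-Fa-f]+): .*?"
    r"SbyMode=(?P<sby>[01]), SbyModeReq=(?P<req>[01])"
)


def parse_lsn(text):
    """Convert an LSN in 'hi/lo' hex text form into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def parse_line(line):
    """Return the parsed fields of one debug line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "tag": m["tag"],
        "lsn": m["lsn"],
        "lsn_int": parse_lsn(m["lsn"]),
        "sby_mode": int(m["sby"]),
        "sby_mode_req": int(m["req"]),
    }


def standby_mode_changes(lines):
    """List (previous LSN, current LSN, old SbyMode, new SbyMode) at each flip."""
    changes = []
    prev = None
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue  # skip lines that are not debug lines
        if prev is not None and rec["sby_mode"] != prev["sby_mode"]:
            changes.append(
                (prev["lsn"], rec["lsn"], prev["sby_mode"], rec["sby_mode"])
            )
        prev = rec
    return changes


# The seven debug lines quoted above, with the log prefix stripped.
SAMPLE = [
    "### [F] @0/10000000: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=0, SbyModeReq=1",
    "### [S] @0/10000060: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=1, SbyModeReq=1",
    "### [F] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=1, SbyModeReq=1",
    "### [S] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=0, SbyModeReq=1",
    "### [S] @6/B8000198: abort=(0/0)0/0, miss=(0/0)0/0, SbyMode=0, SbyModeReq=1",
    "### [A] @6/F3FFFF20: abort=(6/F3FFFF20)0/0, miss=(6/F4000000)0/0, SbyMode=0, SbyModeReq=1",
    "### [S] @6/F4000030: abort=(0/0)6/F3FFFF20, miss=(0/0)6/F4000000, SbyMode=1, SbyModeReq=1",
]

if __name__ == "__main__":
    for old_lsn, new_lsn, old, new in standby_mode_changes(SAMPLE):
        print(f"SbyMode {old} -> {new} between @{old_lsn} and @{new_lsn}")
```

On the quoted lines this reports three flips: 0 to 1 at 0/10000060, 1 to 0 at 6/B6CB27D0, and 0 to 1 again at 6/F4000030, matching the sequence described above.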
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center