From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | "Imseih (AWS), Sami" <simseih(at)amazon(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [BUG] Panic due to incorrect missingContrecPtr after promotion |
Date: | 2022-02-22 20:16:41 |
Message-ID: | 202202222016.3nar64wc7xs7@alvherre.pgsql |
Lists: | pgsql-hackers |
On 2022-Feb-22, Imseih (AWS), Sami wrote:
> On 13.5, a WAL flush PANIC is encountered after a standby is promoted.
>
> With debugging, it was found that when a standby skips a missing
> continuation record on recovery, the missingContrecPtr is not
> invalidated after the record is skipped. Therefore, when the standby
> is promoted to a primary it writes an overwrite_contrecord with an LSN
> of the missingContrecPtr, which is now in the past. At flush time,
> this causes a PANIC. From what I can see, this failure scenario can
> only occur after a standby is promoted.
Ooh, nice find and diagnosis. I can confirm that the test fails as you
described without the code fix, and doesn't fail with it.
I attach the same patch, with the test file put in its final place
rather than as a patch. Due to recent xlog.c changes this needs a bit of
work to apply to back branches; I'll see about getting it in all
branches soon.
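
For readers following along, the failure mode in the quoted report can be
illustrated with a toy model. This is only a sketch under stated
assumptions, not the code in xlog.c or in the attached patch: the struct
ToyRecovery, the toy_* functions, and the invalidate_after_skip flag are
all hypothetical names made up for illustration. Only missingContrecPtr
and the idea of invalidating it after the skip come from the report.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical, heavily simplified recovery state (not the real layout). */
typedef struct ToyRecovery
{
    XLogRecPtr missingContrecPtr;   /* where a continuation record was missing */
    XLogRecPtr replayedUpTo;        /* how far WAL replay has progressed */
} ToyRecovery;

/*
 * The standby finds a missing continuation record at 'ptr' and skips it,
 * resuming replay at 'resume_at'.  With invalidate_after_skip = false the
 * pointer is left set, which models the reported bug; with true it is
 * reset, which models the behaviour the fix is described as restoring.
 */
static void
toy_skip_missing_contrecord(ToyRecovery *rec, XLogRecPtr ptr,
                            XLogRecPtr resume_at, bool invalidate_after_skip)
{
    rec->missingContrecPtr = ptr;
    rec->replayedUpTo = resume_at;
    if (invalidate_after_skip)
        rec->missingContrecPtr = InvalidXLogRecPtr;
}

/* Replay more WAL; the replay position only ever moves forward. */
static void
toy_replay_to(ToyRecovery *rec, XLogRecPtr lsn)
{
    if (lsn > rec->replayedUpTo)
        rec->replayedUpTo = lsn;
}

/*
 * Promotion.  If missingContrecPtr is still set, an overwrite record would
 * be written using that LSN.  Returns false when that LSN lies behind WAL
 * that has already been replayed, i.e. the case in which the real server
 * is reported to PANIC at flush time.
 */
static bool
toy_promote(const ToyRecovery *rec)
{
    if (rec->missingContrecPtr == InvalidXLogRecPtr)
        return true;            /* nothing pending, nothing to overwrite */
    return rec->missingContrecPtr >= rec->replayedUpTo;
}
```

In this model, skipping a record at 0x1000, replaying on to 0x5000, and
then promoting fails when the pointer is left set and succeeds when it is
invalidated after the skip.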
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"I'm impressed how quickly you are fixing this obscure issue. I came from
MS SQL and it would be hard for me to put into words how much of a better job
you all are doing on [PostgreSQL]."
Steve Midgley, http://archives.postgresql.org/pgsql-sql/2008-08/msg00000.php
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Fix-missing-continuation-record-after-standby-pro.patch | text/x-diff | 5.7 KB |