Re: XLogReadRecord() error in XlogReadTwoPhaseData()

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgbf(at)twiska(dot)com, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: XLogReadRecord() error in XlogReadTwoPhaseData()
Date: 2021-11-20 05:55:16
Message-ID: 3135410.1637387716@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> snapper just exhibited the same failure, too:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2021-11-18%2016%3A09%3A49

I grepped the buildfarm logs for all recent (last 3 months) occurrences of
'could not read two-phase state'. Here are the results:

sysname | branch | snapshot | stage | l
-----------+---------------+---------------------+-------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
kittiwake | REL9_6_STABLE | 2021-10-24 12:01:10 | pgbenchCheck | # 'client 1 aborted in state 3: ERROR: could not read two-phase state from xlog at 0/158F4E0
kittiwake | REL_13_STABLE | 2021-10-26 12:51:11 | ContribCheck-en_US.utf8 | # 'pgbench: error: client 3 script 1 aborted in command 4 query 0: ERROR: could not read two-phase state from WAL at 0/168C8D8
kittiwake | REL_14_STABLE | 2021-11-08 15:42:35 | ContribCheck-en_US.utf8 | # 'pgbench: error: client 0 script 0 aborted in command 3 query 0: ERROR: could not read two-phase state from WAL at 0/17ABF48
kittiwake | REL_13_STABLE | 2021-11-16 15:00:52 | ContribCheck-en_US.utf8 | # 'pgbench: error: client 3 script 1 aborted in command 4 query 0: ERROR: could not read two-phase state from WAL at 0/1668020: incorrect resource manager data checksum in record at 0/1668020
snapper | REL_14_STABLE | 2021-11-18 16:09:49 | contrib-amcheckCheck | # 'pgbench: error: client 3 script 1 aborted in command 4 query 0: ERROR: could not read two-phase state from WAL at 0/1770328: unexpected pageaddr 0/0 in log segment 000000010000000000000001, offset 7798784
tadarida | REL_11_STABLE | 2021-11-11 13:29:58 | pgbenchCheck | # 'client 3 aborted in command 3 (SQL) of script 0; ERROR: could not read two-phase state from WAL at 0/1716C68
tadarida | REL_10_STABLE | 2021-11-12 13:01:15 | pgbenchCheck | # 'client 4 aborted in command 3 of script 0; ERROR: could not read two-phase state from WAL at 0/16F1850: invalid record length at 0/16F1850: wanted 24, got 0
tadarida | HEAD | 2021-11-17 13:01:24 | contrib-amcheckCheck | # 'pgbench: error: client 0 script 1 aborted in command 4 query 0: ERROR: could not read two-phase state from WAL at 0/159EF88: unexpected pageaddr 0/0 in log segment 000000010000000000000001, offset 5890048

So not all are exactly 'unexpected pageaddr 0/0', but they do all
look like we read garbage data.
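
For anyone wanting to bucket such lines by the detail that follows the LSN, here is a minimal sketch. It is not the query used for the table above; the names `classify` and `tally` and the category labels are hypothetical, and it assumes each input string is one log line of the shape shown in the table.

```python
import re
from collections import Counter

# Accepts both wordings seen above: "xlog" (REL9_6_STABLE) and "WAL" (later branches).
# The optional ": detail" suffix is whatever the WAL reader reported, if anything.
PATTERN = re.compile(
    r"could not read two-phase state from (?:xlog|WAL) at "
    r"(?P<lsn>[0-9A-F]+/[0-9A-F]+)(?:: (?P<detail>.*))?$"
)


def classify(line):
    """Return (lsn, category) for a matching log line, or None if it does not match."""
    m = PATTERN.search(line.strip())
    if m is None:
        return None
    detail = m.group("detail")
    if detail is None:
        category = "no detail"
    elif detail.startswith("unexpected pageaddr"):
        category = "unexpected pageaddr"
    elif detail.startswith("incorrect resource manager data checksum"):
        category = "bad checksum"
    elif detail.startswith("invalid record length"):
        category = "invalid record length"
    else:
        category = "other"
    return m.group("lsn"), category


def tally(lines):
    """Count matching lines per category, ignoring lines that do not match."""
    results = (classify(line) for line in lines)
    return Counter(r[1] for r in results if r is not None)
```

Applied to the eight lines in the table, this gives four with no detail, two 'unexpected pageaddr', one bad checksum, and one invalid record length.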

regards, tom lane
