Re: pg16: invalid page/page verification failed

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg16: invalid page/page verification failed
Date: 2023-10-05 18:48:00
Message-ID: ZR8E4CH4fUr6L/zr@telsasoft.com
Lists: pgsql-hackers

On Thu, Oct 05, 2023 at 07:16:31PM +0200, Matthias van de Meent wrote:
> On Thu, 5 Oct 2023 at 18:48, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> >
> > On an instance running pg16.0:
> >
> > log_time | 2023-10-05 10:03:00.014-05
> > backend_type | autovacuum worker
> > left | page verification failed, calculated checksum 5074 but expected 5050
> > context | while scanning block 119 of relation "public.postgres_log_2023_10_05_0900"
> >
> > This is the only error I've seen so far, and for all I know there's an
> > issue on the storage behind the VM, or a cosmic ray hit. But I moved
> > the table out of the way and saved a copy of get_raw_page() in case
> > someone wants to ask about it.
> >
> > postgres=# SELECT * FROM heap_page_item_attrs(get_raw_page(801594131::regclass::text, 119), 801594131);
> >   lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_attrs
> > -----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+---------
> >    1 |   2304 |        1 |     16 |        |        |          |        |             |            |        |        |       |
> >    2 |   8160 |        1 |     16 |        |        |          |        |             |            |        |        |       |
> >    3 |   8144 |        1 |     16 |        |        |          |        |             |            |        |        |       |
> > ...all the same except for lp_off...
> >  365 |   2352 |        1 |     16 |        |        |          |        |             |            |        |        |       |
> >  366 |   2336 |        1 |     16 |        |        |          |        |             |            |        |        |       |
> >  367 |   2320 |        1 |     16 |        |        |          |        |             |            |        |        |       |
>
> That's not a HEAP page; it looks more like a btree page: lp_len is too
> short for heap (which starts at lp_len = 24), and there are too many
> line pointers for an 8KiB heap page. btree often has lp_len of 16: 8
> bytes indextuple header, one maxalign of data (e.g. int or bigint).
>
> So, assuming it's a block of a different relation kind, then it's also
> likely it was originally located elsewhere in that other relation,
> indeed causing the checksum failure. You can further validate this by
> looking at the page header's pd_special value - if it is 8176, that'd
> be another indicator for it being a btree.

Nice point.

postgres=# SET ignore_checksum_failure=on; SELECT * FROM generate_series(115,119) AS a, page_header(get_raw_page(801594131::regclass::text, a)) AS b;
WARNING: page verification failed, calculated checksum 5074 but expected 5050
  a  |     lsn      | checksum | flags | lower | upper | special | pagesize | version | prune_xid
-----+--------------+----------+-------+-------+-------+---------+----------+---------+-----------
 115 |  B61/A9436C8 |   -23759 |     4 |    92 |   336 |    8192 |     8192 |       4 |         0
 116 |  B61/A944FA0 |     3907 |     4 |   104 |   224 |    8192 |     8192 |       4 |         0
 117 |  B61/A946828 |   -24448 |     4 |    76 |   264 |    8192 |     8192 |       4 |         0
 118 |  B61/A94CCE0 |    26915 |     4 |    28 |  6256 |    8192 |     8192 |       4 |         0
 119 | B5C/9F30D1C8 |     5050 |     0 |  1492 |  2304 |    8176 |     8192 |       4 |         0

The table itself has a few btree indexes on text columns and a brin
index on log_timestamp, but no indexes on the integer columns.
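
If it would help to identify which of those indexes the stray page belongs
to, pageinspect can decode the saved page directly.  This is only a sketch,
assuming the bytea variant of bt_page_items() (available in recent
pageinspect versions) and assuming the page really is a valid btree page;
otherwise it will just error out:

postgres=# SET ignore_checksum_failure=on;
postgres=# SELECT itemoffset, ctid, itemlen, data
             FROM bt_page_items(get_raw_page(801594131::regclass::text, 119))
             LIMIT 5;

The ctid and data columns would at least show which heap TIDs and key bytes
the index tuples point at, which might narrow down the source index.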

That sounds like what's expected at this point, but after I set
ignore_checksum_failure=on and read the page in, vacuum kicked off and then
crashed (in heap_page_prune(), if that half of the stack trace can be
trusted).

*** stack smashing detected ***: postgres: autovacuum worker postgres terminated

< 2023-10-05 12:35:30.764 CDT >LOG: server process (PID 30692) was terminated by signal 11: Segmentation fault
< 2023-10-05 12:35:30.764 CDT >DETAIL: Failed process was running: autovacuum: VACUUM ANALYZE public.BROKEN_postgres_log_2023_10_05_0900

I took the opportunity to fsck the FS, which showed no errors.

I was curious whether the relfilenodes had somehow gotten confused or
corrupted, but the query below seems to indicate not; the problem is
confined to one block.

postgres=# SELECT oid, relfilenode, oid=relfilenode, relname FROM pg_class WHERE oid BETWEEN 801550000 AND 801594199 ORDER BY 1;
    oid    | relfilenode | ?column? |                     relname
-----------+-------------+----------+-------------------------------------------------
 801564542 |   801564542 | t        | postgres_log_2023_10_05_0800
 801564545 |   801564545 | t        | pg_toast_801564542
 801564546 |   801564546 | t        | pg_toast_801564542_index
 801564547 |   801564547 | t        | postgres_log_2023_10_05_0800_log_time_idx
 801564548 |   801564548 | t        | postgres_log_2023_10_05_0800_error_severity_idx
 801564549 |   801564549 | t        | postgres_log_2023_10_05_0800_error_message_idx
 801564550 |   801564550 | t        | postgres_log_2023_10_05_0800_duration_idx
 801564551 |   801564551 | t        | postgres_log_2023_10_05_0800_tempfile_idx
 801594131 |   801594131 | t        | BROKEN_postgres_log_2023_10_05_0900
 801594134 |   801594134 | t        | pg_toast_801594131
 801594135 |   801594135 | t        | pg_toast_801594131_index
 801594136 |   801594136 | t        | postgres_log_2023_10_05_0900_log_time_idx
 801594137 |   801594137 | t        | postgres_log_2023_10_05_0900_error_severity_idx
 801594138 |   801594138 | t        | postgres_log_2023_10_05_0900_error_message_idx
 801594139 |   801594139 | t        | postgres_log_2023_10_05_0900_duration_idx
 801594140 |   801594140 | t        | postgres_log_2023_10_05_0900_tempfile_idx

Before anybody asks, we didn't retain WAL from this morning.
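
Since the checksum mixes in the block number, one other thing that might
narrow it down: recompute the saved page's checksum at candidate block
numbers and see where its stored checksum of 5050 would have been valid.
Just a sketch, assuming pageinspect's page_checksum() is available here and
that the page contents are otherwise intact; the upper bound is arbitrary:

postgres=# SET ignore_checksum_failure=on;
postgres=# WITH p AS (SELECT get_raw_page(801594131::regclass::text, 119) AS raw)
           SELECT b AS candidate_blkno
           FROM p, generate_series(0, 200000) AS b
           WHERE page_checksum(p.raw, b) = 5050;

That would only identify the block offset the page was checksummed for, not
which relation it belongs to, but combined with the relation sizes it might
be enough to tell where it came from.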

FYI, the storage is ext4 on LVM on SCSI (it looks like this VM doesn't use
vmw_pvscsi but an emulated hardware SCSI controller).

/dev/mapper/data-postgres on /var/lib/pgsql type ext4 (rw,relatime,seclabel,data=ordered)
[ 0.000000] Linux version 3.10.0-1160.el7.x86_64 (mockbuild(at)kbuilder(dot)bsys(dot)centos(dot)org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) ) #1 SMP Mon Oct 19 16:18:59 UTC 2020
[ 1.446380] scsi 2:0:1:0: Direct-Access VMware Virtual disk 1.0 PQ: 0 ANSI: 2
[ 1.470764] scsi target2:0:1: Beginning Domain Validation
[ 1.471077] scsi target2:0:1: Domain Validation skipping write tests
[ 1.471079] scsi target2:0:1: Ending Domain Validation
[ 1.471099] scsi target2:0:1: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 127)
[ 1.484109] sd 2:0:1:0: [sdb] 1048576000 512-byte logical blocks: (536 GB/500 GiB)
[ 1.484136] sd 2:0:1:0: [sdb] Write Protect is off
[ 1.484139] sd 2:0:1:0: [sdb] Mode Sense: 45 00 00 00
[ 1.484163] sd 2:0:1:0: [sdb] Write cache: disabled, read cache: disabled, doesn't support DPO or FUA
[ 1.485808] sd 2:0:1:0: [sdb] Attached SCSI disk
[ 4.271339] sd 2:0:1:0: Attached scsi generic sg1 type 0

--
Justin
