BTree index corruption (heap-tid-past-end, unexpected zero page, misplaced TID in posting list) recurring on high-churn tables, PG 18.3, data_checksums=on, no preceding crash

From: Alessandro Regolini <alessandro(at)regolini(dot)it>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: BTree index corruption (heap-tid-past-end, unexpected zero page, misplaced TID in posting list) recurring on high-churn tables, PG 18.3, data_checksums=on, no preceding crash
Date: 2026-07-01 08:12:13
Message-ID: CAFDALapqu1NLj-svUV-sPdLs-e0DMEiaoS-kTFXhUqtjqn7M=A@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

PostgreSQL version:
PostgreSQL 18.3 (Ubuntu 18.3-1.pgdg24.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 13.3.0-6ubuntu2~24.04.1) 13.3.0, 64-bit
OS: Ubuntu 24.04 (noble), running as a KVM/QEMU guest
data_checksums: on
shared_buffers: 128MB
btree deduplication: enabled (default)
Storage: ext4 (rw,relatime,errors=remount-ro) on a dedicated virtual disk
Relevant GUCs: fsync=on, full_page_writes=on, wal_level=replica,
autovacuum=on

*Summary*-------
Over ~4 weeks we hit recurring btree index corruption on multiple
independent
databases hosted on the same physical host (~80 PG 18 clusters, one DB per
cluster). Corruption concentrates exclusively on high-churn tables
(INSERT + DELETE + HOT update + frequent autovacuum: application cache and a
message-queue table). We observed three distinct signatures, all of them
btree-vs-heap desync:

1. ERROR: heap tid from index tuple (X,Y) points past end of heap page
line pointer array at offset NNN of block B in index "<idx>"
2. index "<idx>" contains unexpected zero page at block B
3. posting list contains misplaced TID in index "<idx>" (btree dedup)

Timeline (verbatim runtime + amcheck log lines, times CEST)
-----------------------------------------------------------

*DB "soave":* 2026-06-05 .. 2026-06-08 ERROR: heap tid from index tuple
points past end
of heap page line pointer array at offset 272 of block N in index
"django_webix_sender_messagerecipient_pkey"
-> emitted every 4h by a periodic query, then stopped on its own.
2026-06-30 09:35 / 09:46 bt_index_check(heapallindexed=>true) reported:
- index "qdc_magazzino_articolo_id_prev_a0195938"
contains unexpected zero page at block N
- index "django_webix_sender_messagerecipient_content_type_id_7552397e"
posting list contains misplaced TID

*DB "ermes":* 2026-06-24 .. 2026-06-30 ERROR: heap tid from index tuple
points past end
of heap page line pointer array at offset 248 of block N in index
"mpaapp_cache_content_type_id_object_i_f6ecfefa_uniq"
-> dozens of occurrences over the window.

Key facts
---------
- data_checksums=on and NO invalid-page / checksum-failure error was ever
logged. The corrupt pages therefore carry valid checksums but structurally
wrong content: they were written by PostgreSQL, this is not a torn write
or
a storage-layer bit rot detected by checksums.
- No crash, PANIC, OOM or immediate shutdown preceded any event. The
postmaster
was up continuously across each window. Host kernel journal shows zero
MCE /
ECC / OOM events in the last 90 days.
- REINDEX INDEX CONCURRENTLY fixed every case with zero data loss (the heap
was
always intact and fully readable), which points to an index-vs-heap desync
rather than heap damage.
- The affected indexes are the busiest tables (application cache and message
queue), constantly vacuumed. Corruption is intermittent and self-heals
after
VACUUM / REINDEX.
- The same signature family reproduced on independent databases with
identical
schema (same application deployed many times), which argues against random
hardware/memory corruption: a random flip would not preferentially target
the
same vacuum-heavy indexes with the same error text across separate
clusters.
- This is not a collation/glibc issue: the symptoms are physical structural
desync (heap-tid-past-end, zero page, misplaced TID), not wrong sort
order.
- fsync=on and full_page_writes=on, so the two most common self-inflicted
corruption causes are excluded.

*What we can provide on the next occurrence*
------------------------------------------
We run a periodic amcheck sweep, so we usually catch a fresh case before
reindexing. Before REINDEX we can capture, for the affected block:
- contrib/pageinspect: bt_page_items() of the index block and
heap_page_items() / page_header() of the referenced heap block
- full bt_index_check / bt_index_parent_check output
- verify_heapam() of the underlying table
Please tell us which dumps would be most useful to root-cause this (btree
deduplication / VACUUM interaction is our current suspicion).

# Alessandro Regolini

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kyotaro Horiguchi 2026-07-01 08:40:53 Re: BUG #19541: Postgresql failed to run 150+ tests on both debug and release configurations with VS 2026 on Windows
Previous Message Thomas Munro 2026-07-01 02:36:05 Re: BUG #19539: psql 17.6 on Windows 11 prints random garbage characters in Russian localized output of \d