Re: BTree index corruption (heap-tid-past-end, unexpected zero page, misplaced TID in posting list) recurring on high-churn tables, PG 18.3, data_checksums=on, no preceding crash

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: alessandro(at)regolini(dot)it
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BTree index corruption (heap-tid-past-end, unexpected zero page, misplaced TID in posting list) recurring on high-churn tables, PG 18.3, data_checksums=on, no preceding crash
Date: 2026-07-02 18:32:31
Message-ID: CAH2-Wzmocfze4XEk=XP6wCdo8kWbnJUsBoJW3RG-Jq=nP16bmQ@mail.gmail.com
Lists: pgsql-bugs

On Thu, Jul 2, 2026 at 2:25 PM Alessandro Regolini
<alessandro(at)regolini(dot)it> wrote:
> What we can provide on the next occurrence
> ------------------------------------------
> We run a periodic amcheck sweep, so we usually catch a fresh case before
> reindexing. Before REINDEX we can capture, for the affected block:
> - contrib/pageinspect: bt_page_items() of the index block and
> heap_page_items() / page_header() of the referenced heap block
> - full bt_index_check / bt_index_parent_check output
> - verify_heapam() of the underlying table
> Please tell us which dumps would be most useful to root-cause this (btree
> deduplication / VACUUM interaction is our current suspicion).

Those all seem useful, but I doubt that bt_index_parent_check is going
to add much over bt_index_check. However, you should be sure to run
bt_index_check with heapallindexed=true, which will verify agreement
between the index and the underlying table.
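
As a rough sketch of the check described above, the snippet below renders a
query that runs bt_index_check with heapallindexed => true against every
valid btree index on one table. The helper name heapallindexed_check_sql and
the table name in the usage line are hypothetical; the query assumes the
amcheck extension is installed and is meant to be pasted into psql, not taken
as the only way to run the check.

```python
def heapallindexed_check_sql(table: str) -> str:
    """Render SQL that verifies every valid btree index on `table`.

    Hypothetical helper for illustration only. It assumes the amcheck
    extension has already been created in the target database.
    """
    # Escape single quotes so the table name is a safe SQL string literal.
    literal = table.replace("'", "''")
    return (
        "SELECT c.relname,\n"
        "       bt_index_check(index => c.oid, heapallindexed => true)\n"
        "FROM pg_index i\n"
        "JOIN pg_class c ON c.oid = i.indexrelid\n"
        "JOIN pg_am am ON am.oid = c.relam\n"
        f"WHERE i.indrelid = '{literal}'::regclass\n"
        "  AND am.amname = 'btree'\n"
        "  AND i.indisvalid AND i.indisready\n"
        "ORDER BY c.relname;"
    )


if __name__ == "__main__":
    # Hypothetical table name; substitute one of the high-churn tables.
    print(heapallindexed_check_sql("public.orders"))
```

A failure shows up as an ERROR from bt_index_check that names the index, so
it is worth keeping the complete psql output of the run.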

If you see a heapallindexed=true failure, getting page images for the
pointed-to heap page is useful (as well as the index page). I prefer a
raw dump of the page itself over a textual representation. See the
procedure here:

https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#contrib/pageinspect_page_dump
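
For illustration only, and not a replacement for the wiki procedure, here is
a minimal sketch of one way to turn a hex-encoded get_raw_page() result into
a raw page file. The helper names page_dump_sql and hex_to_page, the relation
name, and the block number are hypothetical, and the 8192-byte page size is
an assumption (the default BLCKSZ).

```python
BLCKSZ = 8192  # assumption: server built with the default block size


def page_dump_sql(relation: str, blkno: int) -> str:
    """Render a pageinspect query that returns one page as hex text."""
    literal = relation.replace("'", "''")
    return f"SELECT encode(get_raw_page('{literal}', {int(blkno)}), 'hex');"


def hex_to_page(hex_text: str, blcksz: int = BLCKSZ) -> bytes:
    """Decode hex text from psql into the raw bytes of a single page."""
    cleaned = hex_text.strip()
    # Tolerate the "\x" prefix that bytea-as-text output carries.
    if cleaned.startswith("\\x"):
        cleaned = cleaned[2:]
    page = bytes.fromhex(cleaned)
    if len(page) != blcksz:
        raise ValueError(f"expected {blcksz} bytes, got {len(page)}")
    return page


if __name__ == "__main__":
    # Hypothetical index name and block number.
    print(page_dump_sql("my_index", 42))
```

The decoded bytes can then be written out with open(path, "wb") and attached
as-is. The length check is there to catch a truncated or wrapped copy before
the dump is sent.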

--
Peter Geoghegan
