Adding further hardening to nbtree page deletion

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Adding further hardening to nbtree page deletion
Date: 2023-06-16 21:15:08
Message-ID: CAH2-Wz=dayg0vjs4+er84TS9ami=csdzjpuiCGbEw=idhwqhzQ@mail.gmail.com
Lists: pgsql-hackers

Attached patch adds additional hardening to nbtree page deletion. It
makes nbtree VACUUM tolerate a certain sort of cross-page
inconsistency in the structure of an index (corruption). VACUUM can
press on, avoiding an eventual wraparound/xidStopLimit failure in
environments where nobody notices the problem for an extended period.

This is very similar to my recent commit 5abff197 (though it's even
closer to commit 5b861baa). Once again we're demoting an ERROR to a
LOG message, and pressing on with vacuuming. I propose that this patch
be backpatched all the way, too. The hardening added by the patch
seems equally well targeted and low risk. This time the inconsistency
is between a parent page and its child, as opposed to between
siblings. Very familiar stuff, overall.
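
To make the "demote and press on" idea concrete, here is a minimal,
self-contained sketch. It is not taken from the attached patch, and
every name in it (LeafCandidate, can_delete, vacuum_candidates) is
hypothetical. Real nbtree code reports through ereport(); fprintf()
stands in for a LOG-level report here, and the message text only
imitates the real one. The point is just the control flow: a failed
parent/child check skips one page instead of aborting the whole pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified view of a page that VACUUM wants to delete. */
typedef struct LeafCandidate
{
    unsigned block;         /* page being considered for deletion */
    unsigned right_sibling; /* right sibling recorded in the page itself */
    unsigned next_downlink; /* child that the parent's next downlink points to */
} LeafCandidate;

/*
 * Parent/child cross-check.  Returns true when the parent's next downlink
 * agrees with the page's right sibling.  On a mismatch it writes a
 * LOG-style line and returns false, rather than raising an error.
 */
static bool
can_delete(const LeafCandidate *c)
{
    if (c->right_sibling != c->next_downlink)
    {
        fprintf(stderr,
                "LOG: right sibling %u of block %u is not next child %u; skipping page\n",
                c->right_sibling, c->block, c->next_downlink);
        return false;
    }
    return true;
}

/*
 * Walk all candidates.  An inconsistent page is counted in *skipped and
 * left alone; the loop keeps going.  Returns the number of pages that
 * passed the check (and would be deleted).
 */
static int
vacuum_candidates(const LeafCandidate *cands, int n, int *skipped)
{
    int deleted = 0;

    assert(skipped != NULL);
    *skipped = 0;

    for (int i = 0; i < n; i++)
    {
        if (can_delete(&cands[i]))
            deleted++;
        else
            (*skipped)++;
    }
    return deleted;
}
```

Given three candidates where only the middle one has a mismatched
downlink, vacuum_candidates() reports two deletable pages and one
skipped page. With ERROR-style handling, the pass would have stopped
at the middle page and never reached the third.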

I have seen an internal report of the ERROR causing issues for a
production instance, so this definitely can fail in the field on
modern Postgres versions. That said, this particular inconsistency
("right sibling is not next child...") has a long history, and has
been spotted in the field several times over many years. This 2006
thread about problems with a Wisconsin courts database is one example
of that:

https://www.postgresql.org/message-id/flat/3355.1144873721%40sss.pgh.pa.us#b0a89b2d9e7f6a3c818fdf723b8fa29b

At the time the ERROR was a PANIC. A few years later (in 2010), it was
demoted to an ERROR (see commit 8fa30f90). And now I want to demote it
to a LOG -- which is much easier now that we have a robust approach to
page deletion (after 2014 commit efada2b8e9).

--
Peter Geoghegan

Attachment Content-Type Size
v1-0001-nbtree-VACUUM-cope-with-topparent-inconsistencies.patch application/octet-stream 3.5 KB