| From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
|---|---|
| To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
| Cc: | Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, akorotkov(at)postgresql(dot)org, aekorotkov(at)gmail(dot)com, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Bug in amcheck? |
| Date: | 2025-11-26 08:20:40 |
| Message-ID: | abd65090-5336-42cc-b768-2bdd66738404@iki.fi |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 25/11/2025 22:51, Peter Geoghegan wrote:
> On Tue, Nov 25, 2025 at 3:03 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>> To fix this, I guess we need to teach bt_index_parent_check() about
>> half-dead pages. Anyone volunteer to write that patch?
>
> It's not like bt_index_parent_check doesn't generally know about them.
> For example, bt_downlink_missing_check goes to great lengths to
> distinguish between legitimate "missing" downlinks caused by an
> interrupted page deletion, and real missing downlinks caused by
> corruption.
>
> The problem we're seeing here seems likely limited to code added by
> commit d114cc53, which enhanced bt_index_parent_check by adding the
> new bt_child_highkey_check check. bt_child_highkey_check actually
> reuses bt_downlink_missing_check (which deals with half-dead pages
> correctly), but still isn't careful enough about half-dead pages. This
> is kind of surprising, since it *does* account for incomplete splits,
> which are similar.
+1. It took me a moment to understand bt_downlink_missing_check(). The
comments there talk about interrupted page deletions and incomplete
splits, but it's actually never called for half-dead pages. The caller
checks !P_IGNORE(opaque), and P_IGNORE means BTP_DELETED | BTP_HALF_DEAD.
I don't think BTP_DELETED pages should be reachable here at all. So
perhaps we should specifically check BTP_DELETED and give an ERROR on
those. And for clarity, perhaps move the check for BTP_HALF_DEAD into
bt_downlink_missing_check(), alongside the incomplete-split check, so
that both cases would be checked at the same place. Just for clarity,
it's not wrong as it is.
> In short, I think that we need to track something like
> BtreeCheckState.previncompletesplit, but for half-dead pages. And then
> actually use that within bt_child_highkey_check, to avoid spurious
> false-positive reports of corruption.
I think it's even simpler than that, and we can just do this:
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -2268,7 +2268,7 @@ bt_child_highkey_check(BtreeCheckState *state,
* If we visit page with high key, check that it is equal to the
* target key next to corresponding downlink.
*/
- if (!rightsplit && !P_RIGHTMOST(opaque))
+ if (!rightsplit && !P_RIGHTMOST(opaque) && !P_ISHALFDEAD(opaque))
{
BTPageOpaque topaque;
IndexTuple highkey;
Both BTP_HALF_DEAD and BTP_INCOMPLETE_SPLIT indicate that a downlink is
missing, but they work slightly differently. If a page is marked as
BTP_INCOMPLETE_SPLIT, it means that its right sibling has no downlink,
but if a page is marked as BTP_HALF_DEAD, it means that the page itself
has no downlink.
- Heikki
| From | Date | Subject | |
|---|---|---|---|
| Next Message | shveta malik | 2025-11-26 08:35:48 | Re: Proposal: Conflict log history table for Logical Replication |
| Previous Message | David Geier | 2025-11-26 08:15:35 | Re: Performance issues with parallelism and LIMIT |