Re: HOT chain validation in verify_heapam()

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Himanshu Upadhyaya <upadhyaya(dot)himanshu(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Subject: Re: HOT chain validation in verify_heapam()
Date: 2023-03-23 14:04:11
Message-ID: CA+TgmoYqPdY+_2NDABkkMVA4fya5_vbmnXcQc6ytMYNghVKfeg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 22, 2023 at 4:45 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> At the very least there's missing verification that tids actually exists in the
> "Update chain validation" loop, leading to:
> TRAP: failed Assert("ItemIdHasStorage(itemId)"), File: "../../../../home/andres/src/postgresql/src/include/storage/bufpage.h", Line: 355, PID: 645093
>
> Which makes sense - the earlier loop adds t_ctid to the successor array, which
> we then query without checking if there still is such a tid on the page.

Ah, crap. If the /* Perform tuple checks loop */ finds a redirect line
pointer, it verifies that the target is between FirstOffsetnNumber and
maxoff before setting lp_valid[ctx.offnum] = true. But in the case
where it's a CTID link, the equivalent checks are missing. We could
fix that like this:

--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -543,7 +543,8 @@ verify_heapam(PG_FUNCTION_ARGS)
*/
nextblkno = ItemPointerGetBlockNumber(&(ctx.tuphdr)->t_ctid);
nextoffnum = ItemPointerGetOffsetNumber(&(ctx.tuphdr)->t_ctid);
- if (nextblkno == ctx.blkno && nextoffnum != ctx.offnum)
+ if (nextblkno == ctx.blkno && nextoffnum != ctx.offnum &&
+ nextoffnum >= FirstOffsetNumber && nextoffnum <= maxoff)
successor[ctx.offnum] = nextoffnum;
}

> I suspect we don't just need a !ItemIdIsUsed(), but also a check gainst the
> max offset on the page.

I don't see why we need an ItemIdIsUsed check any place where we don't
have one already. lp_valid[x] can't be true if the item x isn't used,
unless we're referencing off the initialized portion of the array,
which we shouldn't do.

> WRT these failures:
> non-heap-only update produced a heap-only tuple at offset 20
>
> I think that's largely a consequence of HeapTupleHeaderIsHotUpdated()'s
> definition:
> /*
> * Note that we stop considering a tuple HOT-updated as soon as it is known
> * aborted or the would-be updating transaction is known aborted. For best
> * efficiency, check tuple visibility before using this macro, so that the
> * INVALID bits will be as up to date as possible.
> */
> #define HeapTupleHeaderIsHotUpdated(tup) \
> ( \
> ((tup)->t_infomask2 & HEAP_HOT_UPDATED) != 0 && \
> ((tup)->t_infomask & HEAP_XMAX_INVALID) == 0 && \
> !HeapTupleHeaderXminInvalid(tup) \
> )

Yeah, it's not good that we're looking at the hint bit or the xmin
there -- it should just be checking the infomask2 bit and nothing
else, I think.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nazir Bilal Yavuz 2023-03-23 14:38:14 Re: Refactor calculations to use instr_time
Previous Message Peter Eisentraut 2023-03-23 13:54:48 Re: Transparent column encryption