Re: HOT chain validation in verify_heapam()

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Himanshu Upadhyaya <upadhyaya(dot)himanshu(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Subject: Re: HOT chain validation in verify_heapam()
Date: 2022-09-06 21:19:00
Message-ID: CA+TgmoancEq+Q-0gHveULix1Y_XFv+T5b5qLeiJeK9FDjW7u_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 6, 2022 at 6:34 AM Himanshu Upadhyaya
<upadhyaya(dot)himanshu(at)gmail(dot)com> wrote:
>> This isn't safe because the line pointer referenced by rditem may not
>> have been sanity-checked yet. Refer to the comment just below where it
>> says "Sanity-check the line pointer's offset and length values".
>>
> handled by creating a new function check_lp and calling it before accessing the redirected tuple.

I think this is going to result in duplicate error messages, because
if A redirects to B, what keeps us from calling check_lp(B) once when
we reach A and again when we reach B?

I am kind of generally suspicious of the idea that, both for redirects
and for ctid links, you just have it check_lp() on the target line
pointer and then maybe try to skip doing that again later when we get
there. That feels error-prone to me. I think we should try to find a
way of organizing the code where we do the check_lp() checks on all
line pointers in order without skipping around or backing up. It's not
100% clear to me what the best way of accomplishing that is, though.

But here's one random idea: add a successor[] array and an lp_valid[]
array. In the first loop, set lp_valid[offset] = true if it passes the
check_lp() checks, and set successor[A] = B if A redirects to B or has
a CTID link to B, without matching xmin/xmax. Then, in a second loop,
iterate over the successor[] array. If successor[A] = B && lp_valid[A]
&& lp_valid[B], then check whether A.xmax = B.xmin; if so, then
complain if predecessor[B] is already set, else set predecessor[B] =
A. Then, in the third loop, iterate over the predecessor array just as
you're doing now. Then it's clear that we do the lp_valid checks
exactly once for every offset that might need them, and in order. And
it's also clear that the predecessor-based checks can never happen
unless the lp_valid checks passed for both of the offsets involved.

> Done, reformatted using pg_indent.

Thanks, but the new check_lp() function's declaration is not formatted
according to pgindent guidelines. It's not enough to fix the problems
once, you have to avoid reintroducing them.

>> + if (!TransactionIdIsValid(curr_xmax) &&
>> + HeapTupleHeaderIsHotUpdated(curr_htup))
>> + {
>> + report_corruption(&ctx,
>> + psprintf("Current tuple at offset %u is
>> HOT but is last tuple in the HOT chain.",
>> + (unsigned) ctx.offnum));
>> + }
>>
>> This check has nothing to do with the predecessor[] array, so it seems
>> like it belongs in check_tuple() rather than here. Also, the message
>> is rather confused, because the test is checking whether the tuple has
>> been HOT-updated, while the message is talking about whether the tuple
>> was *itself* created by a HOT update. Also, when we're dealing with
>> corruption, statements like "is last tuple in the HOT chain" are
>> pretty ambiguous. Also, isn't this an issue for both HOT-updated
>> tuples and also just regular updated tuples? i.e. maybe what we should
>> be complaining about here is something like "tuple has been updated,
>> but xmax is 0" and then make the test check exactly that.
>
> Moved to check_tuple_header. This should be applicable for both HOT and normal updates but even the last updated tuple in the normal update is HEAP_UPDATED so not sure how we can apply this check for a normal update?

Oh, yeah. You're right. I was thinking that HEAP_UPDATED was like
HEAP_HOT_UPDATED, but it's not: HEAP_UPDATED gets set on the new
tuple, while HEAP_HOT_UPDATED gets set on the old tuple.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-09-06 21:57:04 Re: Switching XLog source from archive to streaming when primary available
Previous Message Reid Thompson 2022-09-06 21:10:49 Re: Add tracking of backend memory allocated to pg_stat_activity