|From:||Andres Freund <andres(at)anarazel(dot)de>|
|To:||Robert Haas <robertmhaas(at)gmail(dot)com>|
|Cc:||Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Himanshu Upadhyaya <upadhyaya(dot)himanshu(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>|
|Subject:||Re: HOT chain validation in verify_heapam()|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
On 2023-03-22 13:45:52 -0700, Andres Freund wrote:
> On 2023-03-22 09:19:18 -0400, Robert Haas wrote:
> > On Fri, Mar 17, 2023 at 8:31 AM Aleksander Alekseev
> > <aleksander(at)timescale(dot)com> wrote:
> > > The patch needed a rebase due to a4f23f9b. PFA v12.
> > I have committed this after tidying up a bunch of things in the test
> > case file that I found too difficult to understand -- or in some cases
> > just incorrect, like:
> As noticed by Andrew
> https://postgr.es/m/bfa5bd2b-c0e6-9d65-62ce-97f4766b1c42%40dunslane.net and
> then reproduced on HEAD by me, there's something not right here.
> At the very least there's missing verification that tids actually exists in the
> "Update chain validation" loop, leading to:
> TRAP: failed Assert("ItemIdHasStorage(itemId)"), File: "../../../../home/andres/src/postgresql/src/include/storage/bufpage.h", Line: 355, PID: 645093
> Which makes sense - the earlier loop adds t_ctid to the successor array, which
> we then query without checking if there still is such a tid on the page.
It's not quite so simple - I see now that the lp_valid check tries to prevent
that. However, it's not sufficient, because there is no guarantee that
lp_valid[nextoffnum] is initialized. Consider what happens if t_ctid of a heap
tuple points to beyond the end of the item array - lp_valid[nextoffnum] won't
Why are redirections now checked in two places? There already was a
ItemIdIsUsed() check in the "/* Perform tuple checks */" loop, but now there's
the ItemIdIsRedirected() check in the "Update chain validation." loop as well
- and the output of that is confusing, because it'll just mention the target
of the redirect.
I also think it's not quite right that some of checks inside if
(ItemIdIsRedirected()) continue in case of corruption, others don't. While
there's a later continue, that means the corrupt tuples get added to the
predecessor array. Similarly, in the non-redirect portion, the successor
array gets filled with corrupt tuples, which doesn't seem quite right to me.
A patch addressing some, but not all, of those is attached. With that I don't
see any crashes or false-positives anymore.
|Next Message||Andres Freund||2023-03-22 22:07:57||Re: HOT chain validation in verify_heapam()|
|Previous Message||David G. Johnston||2023-03-22 21:50:07||Re: Options to rowwise persist result of stable/immutable function with RECORD result|