Re: HOT chain validation in verify_heapam()

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Himanshu Upadhyaya <upadhyaya(dot)himanshu(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: HOT chain validation in verify_heapam()
Date: 2022-11-14 22:33:07
Message-ID: 20221114223307.e6vz2hbzshbry5rg@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-11-14 14:13:10 -0800, Peter Geoghegan wrote:
> > I think the problem partially is that the proposed verify_heapam() code is too
> > "aggressive" considering things to be part of the same hot chain - which then
> > means we have to be very careful about erroring out.
> >
> > The attached isolationtester test triggers:
> > "unfrozen tuple was updated to produce a tuple at offset %u which is frozen"
> > "updated version at offset 3 is also the updated version of tuple at offset %u"
> >
> > Despite there afaict not being any corruption. Worth noting that this happens
> > regardless of hot/non-hot updates being used (uncomment s3ci to see).
>
> Why don't you think that there is corruption?

I looked at the state after the test and the complaint is bogus. It's caused
by the patch ignoring the cur->xmax == next->xmin condition if next->xmin is
FrozenTransactionId. The isolationtester test creates a situation where that
leads to verify_heapam() considering tuples to be part of the same chain even
though they aren't.

> Because I feel like I'm repeating myself more than I should, but: why isn't
> it as simple as "HOT chain traversal logic is broken by frozen xmin in the
> obvious way, therefore all bets are off"?

Because that's irrelevant for the testcase and a good number of my concerns.

> Maybe you're right about the proposed new functionality getting things wrong
> with your adversarial isolation test, but I seem to have missed the
> underlying argument. Are you just talking about regular update chains here,
> not HOT chains? Something else?

As I noted, it happens regardless of HOT being used or not. The tuples aren't
part of the same chain, but the patch treats them as if they were. The reason
the patch considers them to be part of the same chain is precisely the
FrozenTransactionId condition I was worried about. Just because t_ctid points
to a tuple on the same page, and that tuple has xmin == FrozenTransactionId,
doesn't mean the two are part of the same chain. Once you encounter a tuple
with a frozen xmin, you simply cannot assume it's part of the chain you've
been following.

Greetings,

Andres Freund
