Re: recovering from "found xmin ... from before relfrozenxid ..."

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, MBeena Emerson <mbeena(dot)emerson(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: recovering from "found xmin ... from before relfrozenxid ..."
Date: 2020-09-21 18:21:43
Message-ID: 991280.1600712503@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sun, Sep 20, 2020 at 1:13 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> 2. As Amit suspected, there's an inconsistency between pruneheap.c's
>> rules for which tuples are removable and vacuum.c's rules for that.
>> This seems like a massive bug in its own right: what's the point of
>> pruneheap.c going to huge effort to decide whether it should keep a
>> tuple if vacuum will then kill it anyway? I do not understand why
>> whoever put in the GlobalVisState stuff only applied it in pruneheap.c
>> and not VACUUM proper.

> I am not sure I fully understand why you're contrasting pruneheap.c
> with vacuum here, because vacuum just does HOT pruning to remove dead
> tuples - maybe calling the relevant functions with different
> arguments, but it doesn't have its own independent logic for that.

Right, but what we end up with is that the very same tuple xmin and
xmax might result in pruning/deletion, or not, depending on whether
it's part of a HOT chain or not. That's at best pretty weird, and
at worst it means that corner-case bugs in other places are triggered
in only one of the two scenarios ... which is what we have here.

> The key point is that the freezing code isn't, or at least
> historically wasn't, very smart about dead tuples. For example, I
> think if you told it to freeze something that was dead it would just
> do it, which is obviously bad. And that's why Andres stuck those
> sanity checks in there. But it's still pretty fragile. I think perhaps
> the pruning code should be rewritten in such a way that it can be
> combined with the code that freezes and marks pages all-visible, so
> that there's not so much action at a distance, but such an endeavor is
> in itself pretty scary, and certainly not back-patchable.

Not sure. The pruning code is trying to serve two masters, that is
both VACUUM and on-the-fly cleanup during ordinary queries. If you
try to merge it with other tasks that VACUUM does, you're going to
have a mess for the second usage. I fear there's going to be pretty
strong conservation of cruft either way.

FWIW, weakening the sanity checks in heap_prepare_freeze_tuple is
*not* my preferred fix here. But it'll take some work in other
places to preserve them.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-09-21 18:22:26 Re: Improper use about DatumGetInt32
Previous Message Daniel Gustafsson 2020-09-21 18:18:42 Re: OpenSSL 3.0.0 compatibility