Re: HOT vs freezing issue causing "cannot freeze committed xmax"

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: HOT vs freezing issue causing "cannot freeze committed xmax"
Date: 2020-07-24 16:55:14
Message-ID: 20200724165514.dnu5hr4vvgkssf5p@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-07-24 11:06:58 -0400, Robert Haas wrote:
> On Thu, Jul 23, 2020 at 2:10 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > In the case the HOT logic triggers, we'll call
> > heap_prepare_freeze_tuple() even when the tuple is dead.
>
> I think this is very bad. I've always been confused about these
> comments, but I couldn't quite put my finger on the problem. Now I
> think I can: the comments here imagine that we have the option either
> to set tupgone, causing the line pointer to be marked unused by an
> eventual call to lazy_vacuum_page(), or that we can decline to set
> tupgone, which will leave the tuple around to be handled by the next
> vacuum.

Yea. I think the only saving grace is that it's not obvious when the
situation can arise without prior corruption. But even if that's actuall
impossible, it seems extremely fragile. I stared at heap_prune_chain()
for quite a while and couldn't convince myself either way.

> However, we don't really have a choice at all. A choice implies that
> either option is correct, and therefore we can elect the one we
> prefer. But here, it's not just that one option is incorrect, but that
> both options are incorrect. Setting tupgone controls whether or not
> the tuple is considered for freezing. If we DON'T consider freezing
> it, then we might manage to advance relfrozenxid while an older XID
> still exists in the table. If we DO consider freezing it, we will
> correctly conclude that it needs to be frozen, but then the freezing
> code is in an impossible situation, because it has no provision for
> getting rid of tuples, only for keeping them. Its logic assumes that
> whenever we are freezing xmin or xmax we do that in a way that causes
> the tuple to be visible to everyone, but this tuple should be visible
> to no one. So with your changes it now errors out instead of
> corrupting data, but that's just replacing one bad thing (data
> corruption) with another (VACUUM failures).

I suspect that the legitimate cases hitting this branch won't error out,
because then xmin/xmax aren't old enough to need to be frozen.

> I think the actual correct behavior here is to mark the line pointer
> as dead.

That's not trivial, because just doing that naively will break HOT.

> The easiest way to accomplish that is probably to have
> lazy_scan_heap() just emit an extra XLOG_HEAP2_CLEAN record beyond
> whatever HOT-pruning already did, if it's necessary. A better solution
> would probably be to merge HOT-pruning with setting things all-visible
> and have a single function that does both, but that seems a lot more
> invasive, and definitely unsuitable for back-patching.

I suspect that merging pruning and this logic in lazy_scan_heap() really
is the only proper way to solve this kind of issue.

I wonder if, given we don't know if this issue can be hit in a real
database, and given that it already triggers an error, the right way to
deal with this in the back-branches is to emit a more precise error
message. I.e. if we hit this branch, and either xmin/xmax are older than
the cutoff, then we issue a more specific ERROR.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2020-07-24 17:03:30 Re: Making CASE error handling less surprising
Previous Message Andres Freund 2020-07-24 16:49:13 Re: Making CASE error handling less surprising