HOT vs freezing issue causing "cannot freeze committed xmax"

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: HOT vs freezing issue causing "cannot freeze committed xmax"
Date: 2020-07-23 18:10:18
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


In a development branch of mine Thomas / the CF bot found a relatively
rare regression failures. That turned out to be because there was an
edge case in which heap_page_prune() was a bit more pessimistic than
lazy_scan_heap(). But I wonder if this isn't an issue more broadly:

The issue I am concerned about is lazy_scan_heap()'s logic for DEAD HOT
updated tuples:

* Ordinarily, DEAD tuples would have been removed by
* heap_page_prune(), but it's possible that the tuple
* state changed since heap_page_prune() looked. In
* particular an INSERT_IN_PROGRESS tuple could have
* changed to DEAD if the inserter aborted. So this
* cannot be considered an error condition.
* If the tuple is HOT-updated then it must only be
* removed by a prune operation; so we keep it just as if
* it were RECENTLY_DEAD. Also, if it's a heap-only
* tuple, we choose to keep it, because it'll be a lot
* cheaper to get rid of it in the next pruning pass than
* to treat it like an indexed tuple. Finally, if index
* cleanup is disabled, the second heap pass will not
* execute, and the tuple will not get removed, so we must
* treat it like any other dead tuple that we choose to
* keep.
* If this were to happen for a tuple that actually needed
* to be deleted, we'd be in trouble, because it'd
* possibly leave a tuple below the relation's xmin
* horizon alive. heap_prepare_freeze_tuple() is prepared
* to detect that case and abort the transaction,
* preventing corruption.
if (HeapTupleIsHotUpdated(&tuple) ||
HeapTupleIsHeapOnly(&tuple) ||
params->index_cleanup == VACOPT_TERNARY_DISABLED)
nkeep += 1;
tupgone = true; /* we can delete the tuple */
all_visible = false;

In the case the HOT logic triggers, we'll call
heap_prepare_freeze_tuple() even when the tuple is dead. Which then can
lead us to
if (TransactionIdPrecedes(xid, cutoff_xid))
* If we freeze xmax, make absolutely sure that it's not an XID
* that is important. (Note, a lock-only xmax can be removed
* independent of committedness, since a committed lock holder has
* released the lock).
if (!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask) &&
errmsg_internal("cannot freeze committed xmax %u",
freeze_xmax = true;

(before those errors we'd just have unset xmax)

Now obviously the question is whether it's possible that
heap_page_prune() left alive anything that could be seen as DEAD for the
check in lazy_scan_heap(), and that additionally is older than the
cutoff passed to heap_prepare_freeze_tuple().

I'm not sure - it seems like it could be possible in some corner cases,
when transactions abort after the heap_page_prune() but before the
second HeapTupleSatisfiesVacuum().

But regardless of whether it's possible today, it seems extremely
fragile. ISTM we should at least have a bunch of additional error checks
in the HOT branch for HEAPTUPLE_DEAD.


Andres Freund


Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2020-07-23 19:02:18 Re: Making CASE error handling less surprising
Previous Message Andres Freund 2020-07-23 18:00:59 Re: 'with' regression tests fails rarely (and spuriously)