From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | HOT vs freezing issue causing "cannot freeze committed xmax" |
Date: | 2020-07-23 18:10:18 |
Message-ID: | 20200723181018.neey2jd3u7rfrfrn@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
In a development branch of mine Thomas / the CF bot found a relatively
rare regression failures. That turned out to be because there was an
edge case in which heap_page_prune() was a bit more pessimistic than
lazy_scan_heap(). But I wonder if this isn't an issue more broadly:
The issue I am concerned about is lazy_scan_heap()'s logic for DEAD HOT
updated tuples:
/*
* Ordinarily, DEAD tuples would have been removed by
* heap_page_prune(), but it's possible that the tuple
* state changed since heap_page_prune() looked. In
* particular an INSERT_IN_PROGRESS tuple could have
* changed to DEAD if the inserter aborted. So this
* cannot be considered an error condition.
*
* If the tuple is HOT-updated then it must only be
* removed by a prune operation; so we keep it just as if
* it were RECENTLY_DEAD. Also, if it's a heap-only
* tuple, we choose to keep it, because it'll be a lot
* cheaper to get rid of it in the next pruning pass than
* to treat it like an indexed tuple. Finally, if index
* cleanup is disabled, the second heap pass will not
* execute, and the tuple will not get removed, so we must
* treat it like any other dead tuple that we choose to
* keep.
*
* If this were to happen for a tuple that actually needed
* to be deleted, we'd be in trouble, because it'd
* possibly leave a tuple below the relation's xmin
* horizon alive. heap_prepare_freeze_tuple() is prepared
* to detect that case and abort the transaction,
* preventing corruption.
*/
if (HeapTupleIsHotUpdated(&tuple) ||
HeapTupleIsHeapOnly(&tuple) ||
params->index_cleanup == VACOPT_TERNARY_DISABLED)
nkeep += 1;
else
tupgone = true; /* we can delete the tuple */
all_visible = false;
break;
In the case the HOT logic triggers, we'll call
heap_prepare_freeze_tuple() even when the tuple is dead. Which then can
lead us to
if (TransactionIdPrecedes(xid, cutoff_xid))
{
/*
* If we freeze xmax, make absolutely sure that it's not an XID
* that is important. (Note, a lock-only xmax can be removed
* independent of committedness, since a committed lock holder has
* released the lock).
*/
if (!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask) &&
TransactionIdDidCommit(xid))
ereport(PANIC,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg_internal("cannot freeze committed xmax %u",
xid)));
freeze_xmax = true;
(before those errors we'd just have unset xmax)
Now obviously the question is whether it's possible that
heap_page_prune() left alive anything that could be seen as DEAD for the
check in lazy_scan_heap(), and that additionally is older than the
cutoff passed to heap_prepare_freeze_tuple().
I'm not sure - it seems like it could be possible in some corner cases,
when transactions abort after the heap_page_prune() but before the
second HeapTupleSatisfiesVacuum().
But regardless of whether it's possible today, it seems extremely
fragile. ISTM we should at least have a bunch of additional error checks
in the HOT branch for HEAPTUPLE_DEAD.
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2020-07-23 19:02:18 | Re: Making CASE error handling less surprising |
Previous Message | Andres Freund | 2020-07-23 18:00:59 | Re: 'with' regression tests fails rarely (and spuriously) |