Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)
Date: 2022-04-14 18:47:26
Message-ID: 1533665.1649962046@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2022-04-14 12:16:45 -0400, Robert Haas wrote:
>> I got curious and looked at the underlying problem here and I am
>> wondering whether HaveRegisteredOrActiveSnapshot() is just buggy. It
>> seems to me that the code is always going to return true if there are
>> any active snapshots, and the rest of the function is intended to test
>> whether there is a registered snapshot other than the catalog
>> snapshot. But I don't think that's what this code does:
>>
>> if (pairingheap_is_empty(&RegisteredSnapshots) ||
>> !pairingheap_is_singular(&RegisteredSnapshots))
>> return false;
>>
>> return CatalogSnapshot == NULL;

> Certainly looks off...

Yeah, this is broken. Whilst waiting around for a build on wrasse's
host, I reproduced the problem shown in this thread, and here's what
I see at the point of the exception:

(gdb) p RegisteredSnapshots
$5 = {ph_compare = 0x9a6000 <xmin_cmp>, ph_arg = 0x0,
ph_root = 0xec3168 <CatalogSnapshotData+72>}
(gdb) p *RegisteredSnapshots.ph_root
$6 = {first_child = 0x2d85d70, next_sibling = 0x0, prev_or_parent = 0x0}
(gdb) p CatalogSnapshotData
$7 = {snapshot_type = SNAPSHOT_MVCC, xmin = 52155, xmax = 52155,
xip = 0x2d855b0, xcnt = 0, subxip = 0x2de9130, subxcnt = 0,
suboverflowed = false, takenDuringRecovery = false, copied = false,
curcid = 0, speculativeToken = 0, vistest = 0x0, active_count = 0,
regd_count = 0, ph_node = {first_child = 0x2d85d70, next_sibling = 0x0,
prev_or_parent = 0x0}, whenTaken = 0, lsn = 0, snapXactCompletionCount = 1}
(gdb) p CatalogSnapshot
$8 = (Snapshot) 0xec3120 <CatalogSnapshotData>
(gdb) p *(Snapshot) (0x2d85d70-72)
$9 = {snapshot_type = SNAPSHOT_MVCC, xmin = 52155, xmax = 52155, xip = 0x0,
xcnt = 0, subxip = 0x0, subxcnt = 0, suboverflowed = false,
takenDuringRecovery = false, copied = true, curcid = 0,
speculativeToken = 0, vistest = 0x0, active_count = 0, regd_count = 2,
ph_node = {first_child = 0x0, next_sibling = 0x0,
prev_or_parent = 0xec3168 <CatalogSnapshotData+72>}, whenTaken = 0,
lsn = 0, snapXactCompletionCount = 0}
(gdb) p ActiveSnapshot
$10 = (ActiveSnapshotElt *) 0x0

So in fact there IS another registered snapshot, and
HaveRegisteredOrActiveSnapshot is just lying. I think the
correct test is more nearly what we have in
InvalidateCatalogSnapshotConditionally:

if (CatalogSnapshot &&
ActiveSnapshot == NULL &&
pairingheap_is_singular(&RegisteredSnapshots))
// then the CatalogSnapshot is the only one.

Ergo, this actually is a bug in 277692220.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-04-14 19:05:50 Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)
Previous Message Robert Haas 2022-04-14 18:19:08 Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)