Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)
Date: 2022-04-14 16:16:45
Message-ID: CA+TgmoaLb0j9zEfkk6bmuKM5422JenzzcJLJSiPLgwfMd+xnBw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 14, 2022 at 11:46 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I don't think Andres had any intention of ignoring it indefinitely.
> What he is doing is prioritizing it lower than several of the other
> open items, an opinion I fully agree with.

Well, my point is that, historically, relegating things to the older
bugs section often means nothing happens, which I think would not be
great in this case. However, I'll try to shut up about the procedural
issues for the moment, since I seem to be in the minority.

I got curious and looked at the underlying problem here and I am
wondering whether HaveRegisteredOrActiveSnapshot() is just buggy. It
seems to me that the code is always going to return true if there are
any active snapshots, and the rest of the function is intended to test
whether there is a registered snapshot other than the catalog
snapshot. But I don't think that's what this code does:

if (pairingheap_is_empty(&RegisteredSnapshots) ||
!pairingheap_is_singular(&RegisteredSnapshots))
return false;

return CatalogSnapshot == NULL;

So, if there are 0 active snapshots we return false. And if there is 1
active snapshot then we return true if there's no catalog snapshot and
otherwise false. So far so good. But if there are 2 or more registered
snapshots then the pairing heap will be neither empty nor singular so
it seems to me we will return false, which seems to me to be the wrong
answer. I tried this:

diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index a0be0c411a..4e8e26a362 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -1646,9 +1646,10 @@ HaveRegisteredOrActiveSnapshot(void)
* removed at any time due to invalidation processing. If explicitly
* registered more than one snapshot has to be in RegisteredSnapshots.
*/
- if (pairingheap_is_empty(&RegisteredSnapshots) ||
- !pairingheap_is_singular(&RegisteredSnapshots))
+ if (pairingheap_is_empty(&RegisteredSnapshots))
return false;
+ if (!pairingheap_is_singular(&RegisteredSnapshots))
+ return true;

return CatalogSnapshot == NULL;
}

I find that 'make check-world' passes with this change, which is
disturbing, because it also passes without this change. That means we
don't have any tests that reach HaveRegisteredOrActiveSnapshot() with
more than one registered snapshot.

Also, unless we have plans to use HaveRegisteredOrActiveSnapshot() in
more places, I think we should consider revising the whole approach
here. The way init_toast_snapshot() is coded, we basically have some
code that tests for active and registered snapshots and finds the
oldest one. We error out if there isn't one. And then we get to this
assertion, which checks the same stuff a second time but with an
additional check to see whether we ended up with the catalog snapshot.
Wouldn't it make more sense if GetOldestSnapshot() just refused to
return the catalog snapshot in the first place?

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-04-14 16:18:50 Re: Intermittent buildfarm failures on wrasse
Previous Message Nathan Bossart 2022-04-14 16:13:15 Re: allow specifying action when standby encounters incompatible parameter settings