Re: Assertion failure in SnapBuildInitialSnapshot()

From: Andres Freund <andres(at)anarazel(dot)de>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, kuroda(dot)hayato(at)fujitsu(dot)com, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Assertion failure in SnapBuildInitialSnapshot()
Date: 2022-11-15 01:25:31
Message-ID: 20221115012531.laz5fokt2gnbhto7@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-11-10 16:04:40 +0530, Amit Kapila wrote:
> I don't have any good ideas on how to proceed with this. Any thoughts
> on this would be helpful?

One thing worth doing might be to convert the assertion path into an elog(),
mentioning both xids (or add a framework for things like AssertLT(), but that
seems hard). With the concrete values we could make a better guess at what's
going wrong.

It'd probably not hurt to just perform this check independent of
USE_ASSERT_CHECKING - compared to the cost of creating a slot it's neglegible.

That'll obviously only help us whenever we re-encounter the issue, which will
likely be a while...

Have you tried reproducing the issue by running the test in a loop?

One thing I noticed just now is that we don't assert
builder->building_full_snapshot==true. I think we should? That didn't use to
be an option, but now it is... It doesn't look to me like that's the issue,
but ...

Hm, also, shouldn't the patch adding CRS_USE_SNAPSHOT have copied more of
SnapBuildExportSnapshot()? Why aren't the error checks for
SnapBuildExportSnapshot() needed? Why don't we need to set XactReadOnly? Which
transactions are we even in when we import the snapshot (cf.
SnapBuildExportSnapshot() doing a StartTransactionCommand()).

I'm also somewhat suspicious of calling RestoreTransactionSnapshot() with
source=MyProc. Looks like it works, but it'd be pretty easy to screw up, and
there's no comments in SetTransactionSnapshot() or
ProcArrayInstallRestoredXmin() warning that that might be the case.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Anton A. Melnikov 2022-11-15 01:39:53 Re: [BUG] Logical replica crash if there was an error in a function.
Previous Message Richard Guo 2022-11-15 01:02:13 Re: A new strategy for pull-up correlated ANY_SUBLINK