From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Re: Assertion failure in SnapBuildInitialSnapshot() |
Date: | 2023-01-30 11:27:19 |
Message-ID: | CAA4eK1KDFeh=ZbvSWPx=ir2QOXBxJbH0K8YqifDtG3xJENLR+w@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Jan 30, 2023 at 11:34 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> I have reproduced it manually. For this, I had to manually make the
> debugger call ReplicationSlotsComputeRequiredXmin(false) via path
> SnapBuildProcessRunningXacts()->LogicalIncreaseXminForSlot()->LogicalConfirmReceivedLocation()
> ->ReplicationSlotsComputeRequiredXmin(false) for the apply worker. The
> sequence of events is something like (a) the replication_slot_xmin for
> tablesync worker is overridden by apply worker as zero as explained in
> Sawada-San's email, (b) another transaction happened on the publisher
> that will increase the value of ShmemVariableCache->nextXid (c)
> tablesync worker invokes
> SnapBuildInitialSnapshot()->GetOldestSafeDecodingTransactionId() which
> will return an oldestSafeXid which is higher than snapshot's xmin.
> This happens because replication_slot_xmin has an InvalidTransactionId
> value and we won't consider replication_slot_catalog_xmin because
> catalogOnly flag is false and there is no other open running
> transaction. I think we should try to get a simplified test to
> reproduce this problem if possible.
>
Here are steps to reproduce it manually with the help of a debugger:
Session-1
==========
select pg_create_logical_replication_slot('s', 'test_decoding');
create table t2(c1 int);
select pg_replication_slot_advance('s', pg_current_wal_lsn());
-- Debug this statement. Stop before taking ProcArrayLock in
-- ProcArraySetReplicationSlotXmin.
Session-2
============
psql -d postgres
Begin;
Session-3
===========
psql -d "dbname=postgres replication=database"
begin transaction isolation level repeatable read read only;
CREATE_REPLICATION_SLOT slot1 LOGICAL test_decoding USE_SNAPSHOT;
-- Debug this statement. Stop in SnapBuildInitialSnapshot() before
-- taking ProcArrayLock.
Session-1
==========
Continue debugging and finish execution of
ProcArraySetReplicationSlotXmin. Verify that
procArray->replication_slot_xmin is zero (InvalidTransactionId).
Session-2
=========
Select txid_current();
Commit;
Session-3
==========
Continue debugging.
Verify that safeXid follows snap->xmin. This leads to an assertion
failure (in back branches) or an error (in HEAD).
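To make the arithmetic behind the last step concrete, here is a toy
Python model of the computation. It is only a sketch of my
understanding of GetOldestSafeDecodingTransactionId(), not PostgreSQL
code: the class and function names are hypothetical, the XID values
(740, 741) are made up, and the comparisons ignore XID wraparound.

```python
from dataclasses import dataclass, field

INVALID_XID = 0  # stands in for InvalidTransactionId


@dataclass
class ProcArrayModel:
    """Hypothetical stand-in for the shared state involved in the race."""
    next_xid: int
    replication_slot_xmin: int = INVALID_XID
    replication_slot_catalog_xmin: int = INVALID_XID
    running_xids: list = field(default_factory=list)


def oldest_safe_decoding_xid(procarray, catalog_only):
    """Toy version of the oldest-safe-xid computation (no wraparound)."""
    oldest = procarray.next_xid
    slot_xmin = procarray.replication_slot_xmin
    if slot_xmin != INVALID_XID and slot_xmin < oldest:
        oldest = slot_xmin
    catalog_xmin = procarray.replication_slot_catalog_xmin
    # The catalog xmin is consulted only when catalog_only is true.
    if catalog_only and catalog_xmin != INVALID_XID and catalog_xmin < oldest:
        oldest = catalog_xmin
    for xid in procarray.running_xids:
        if xid != INVALID_XID and xid < oldest:
            oldest = xid
    return oldest


def initial_snapshot_is_safe(procarray, snap_xmin):
    """False when safeXid follows snap->xmin, i.e. the failing check."""
    safe_xid = oldest_safe_decoding_xid(procarray, catalog_only=False)
    return not (safe_xid > snap_xmin)


# Before the race: the slot xmin still protects the snapshot's xmin.
snap_xmin = 740
procarray = ProcArrayModel(next_xid=740,
                           replication_slot_xmin=740,
                           replication_slot_catalog_xmin=740)
before_ok = initial_snapshot_is_safe(procarray, snap_xmin)

# Session-1 resets replication_slot_xmin to zero; Session-2 consumes an
# xid and commits, so nextXid advances and no transaction is running.
procarray.replication_slot_xmin = INVALID_XID
procarray.next_xid = 741
race_safe_xid = oldest_safe_decoding_xid(procarray, catalog_only=False)
race_ok = initial_snapshot_is_safe(procarray, snap_xmin)
```

In this model the check passes before the race and fails after it,
because with replication_slot_xmin invalid and catalogOnly false
nothing pulls the result below nextXid.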
--
With Regards,
Amit Kapila.