Error "initial slot snapshot too large" in create replication slot

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Error "initial slot snapshot too large" in create replication slot
Date: 2021-10-11 06:19:41
Message-ID: CAFiTN-tqopqpfS6HHug2nnOGieJJ_nm-Nvy0WBZ=Zpo-LqtSJA@mail.gmail.com
Lists: pgsql-hackers

While creating an exported snapshot, I don't see anything that prevents
the number of xids in the snapshot from exceeding
GetMaxSnapshotXidCount().

Basically, while converting the HISTORIC snapshot to an MVCC snapshot
in SnapBuildInitialSnapshot(), we add every xid between snap->xmin and
snap->xmax for which no commit was recorded to the MVCC snap->xip
array. The problem is that we add both top-level xids and subxids to
the same array, yet expect the xid count not to exceed
GetMaxSnapshotXidCount(). So this looks like a bug, but I am not sure
what the right fix is. Some options are:

a) Don't limit the xid count in the exported snapshot, and resize the
array dynamically.
b) Increase the limit to GetMaxSnapshotXidCount() +
GetMaxSnapshotSubxidCount(). But with option b) there is still the
problem of how to handle overflowed subtransactions.
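To make the failure mode concrete, here is a rough standalone model of that conversion loop. It is only a sketch, not the code in snapbuild.c: the names Xid, xid_cmp and collect_uncommitted_xids are made up for illustration, the limit is passed in as a plain integer standing in for GetMaxSnapshotXidCount(), and xid wraparound is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical simplified xid type; wraparound is ignored in this model. */
typedef uint32_t Xid;

/* Comparator for bsearch() over a sorted array of xids. */
int
xid_cmp(const void *a, const void *b)
{
    Xid x = *(const Xid *) a;
    Xid y = *(const Xid *) b;

    return (x > y) - (x < y);
}

/*
 * Walk [xmin, xmax) and store every xid that is NOT in the sorted
 * "committed" array into "out". Top-level xids and subxids are not
 * distinguished, which is the point of the report: both count against
 * the same limit.
 *
 * Returns the number of xids stored, or -1 when the limit would be
 * exceeded (the "initial slot snapshot too large" case).
 */
int
collect_uncommitted_xids(Xid xmin, Xid xmax,
                         const Xid *committed, size_t ncommitted,
                         Xid *out, int limit)
{
    int n = 0;

    for (Xid xid = xmin; xid < xmax; xid++)
    {
        if (bsearch(&xid, committed, ncommitted, sizeof(Xid), xid_cmp) != NULL)
            continue;           /* commit was recorded, skip it */

        if (n >= limit)
            return -1;          /* more uncommitted xids than slots */

        out[n++] = xid;
    }
    return n;
}
```

With a limit sized for the number of backends, a handful of sessions that each open several subtransactions is enough to push the count past the limit, because every subxid in the range takes a slot of its own.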

I have reproduced the issue locally:

1. Configuration
max_connections = 5
autovacuum = off
max_worker_processes = 0

2. Then, from pgbench, I ran the attached script (test.sql) from 5 clients:
./pgbench -i postgres
./pgbench -c4 -j4 -T 3000 -f test1.sql -P1 postgres

3. Concurrently, create a replication slot:
[dilipkumar(at)localhost bin]$ ./psql "dbname=postgres replication=database"
postgres[7367]=#
postgres[6463]=# CREATE_REPLICATION_SLOT "slot" LOGICAL "test_decoding";
ERROR: 40001: initial slot snapshot too large
LOCATION: SnapBuildInitialSnapshot, snapbuild.c:597
postgres[6463]=# CREATE_REPLICATION_SLOT "slot" LOGICAL "test_decoding";
ERROR: XX000: clearing exported snapshot in wrong transaction state
LOCATION: SnapBuildClearExportedSnapshot, snapbuild.c:690

I could reproduce this issue at least once in every 8-10 attempts at
creating the replication slot.

Note: After hitting that error, I noticed a second one, "clearing
exported snapshot in wrong transaction state". That happens because
"ExportInProgress" is not cleared on transaction abort. A simple fix
is to clear this state on transaction abort. Maybe I should raise this
as a separate issue?
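The second error can be modelled with a toy state machine. Again this is only a sketch under assumptions: export_in_progress stands in for "ExportInProgress", in_transaction stands in for the transaction state check, and the function names and the reset_export_flag parameter are hypothetical, used only to compare the current behaviour with the proposed fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the backend-local state involved. */
bool export_in_progress = false;
bool in_transaction = false;

/* Exporting a snapshot starts a transaction and sets the flag. */
void
start_export(void)
{
    in_transaction = true;
    export_in_progress = true;
}

/*
 * Transaction abort (for example after "initial slot snapshot too
 * large"). reset_export_flag = false models the reported behaviour,
 * true models the proposed fix of clearing the flag on abort.
 */
void
abort_transaction(bool reset_export_flag)
{
    in_transaction = false;
    if (reset_export_flag)
        export_in_progress = false;
}

/*
 * Cleanup run at the start of the next command.
 * Returns 0 on success, -1 for the "wrong transaction state" case.
 */
int
clear_exported_snapshot(void)
{
    if (!export_in_progress)
        return 0;               /* nothing exported, the usual case */

    if (!in_transaction)
        return -1;              /* stale flag left behind by an abort */

    export_in_progress = false;
    in_transaction = false;
    return 0;
}
```

In this model, an abort that leaves the flag set makes the next cleanup fail, while an abort that resets it lets the next CREATE_REPLICATION_SLOT attempt start from a clean state.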

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
test.sql application/sql 1.0 KB
