| From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
|---|---|
| To: | SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: [PATCH] Release replication slot on error in SQL-callable slot functions |
| Date: | 2026-05-11 03:01:33 |
| Message-ID: | CAHGQGwFZaWj8DctXuhWQZwSqi631=NKzQJyDV4yqT1Qapt8MFQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Sun, May 10, 2026 at 5:45 AM SATYANARAYANA NARLAPURAM
<satyanarlapuram(at)gmail(dot)com> wrote:
>
> Hi Hackers,
>
> SQL-callable replication slot functions acquire a slot (setting
> the process-global MyReplicationSlot) but can then ERROR before reaching
> ReplicationSlotRelease(). If such an error is caught by a PL/pgSQL
> EXCEPTION block (which uses a subtransaction), MyReplicationSlot remains
> set because there is no subtransaction-level cleanup hook for replication
> slots.
>
> Any subsequent slot operation in the same session then hits
> Assert(MyReplicationSlot == NULL) and crashes the backend on assert
> enabled builds. In release builds the stale MyReplicationSlot is silently overwritten,
> permanently orphaning the old slot as "active." The orphaned slot blocks any other
> session from acquiring it, vacuum and WAL deletion.
>
> Repro:
>
> SELECT pg_create_logical_replication_slot('adv_test', 'test_decoding');
>
> DO $$ BEGIN
> PERFORM pg_replication_slot_advance('adv_test', '0/1'::pg_lsn);
> EXCEPTION WHEN others THEN
> RAISE NOTICE 'caught: %', SQLERRM;
> END $$;
>
> SELECT count(*) FROM pg_logical_slot_get_changes('adv_test', NULL, NULL);
>
> 2026-05-09 19:45:06.619 UTC [1096805] STATEMENT: SELECT pg_create_logical_replication_slot('adv_test', 'test_decoding');
> TRAP: failed Assert("MyReplicationSlot == NULL"), File: "slot.c", Line: 638, PID: 1096805
>
>
> Attached a patch to address this by wrapping error-prone paths in PG_TRY/PG_CATCH blocks
> and call ReplicationSlotRelease().
Thanks for the report and the patch!
I think wrapping the slot-processing code with PG_TRY()/PG_CATCH() seems
a good direction for addressing the issue you reported.
+ PG_CATCH();
+ {
+ ReplicationSlotRelease();
When create_logical_replication_slot() is called with temporary = true,
the created logical replication slot has RS_TEMPORARY persistency. Such a slot
is not dropped by ReplicationSlotRelease(), whereas an RS_EPHEMERAL slot is
dropped via ReplicationSlotDropAcquired().
So even with the v1 patch, a temporary logical replication slot can remain
unexpectedly if pg_create_logical_replication_slot() throws an error.
In this case, should create_logical_replication_slot() explicitly drop the slot
with ReplicationSlotDropAcquired(), or temporarily change the slot persistency
to RS_EPHEMERAL before calling ReplicationSlotRelease()?
Does a newly created logical replication slot created by
pg_copy_logical_replication_slot() have the same issue?
+ PG_CATCH();
+ {
+ ReplicationSlotRelease();
Should ReplicationSlotRelease() be called only when MyReplicationSlot
is not NULL?
/* Acquire the slot so we "own" it */
ReplicationSlotAcquire(NameStr(*slotname), true, true);
- /* A slot whose restart_lsn has never been reserved cannot be advanced */
- if (!XLogRecPtrIsValid(MyReplicationSlot->data.restart_lsn))
- ereport(ERROR,
- (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("replication slot \"%s\" cannot be advanced",
- NameStr(*slotname)),
- errdetail("This slot has never previously reserved
WAL, or it has been invalidated.")));
+ PG_TRY();
+ {
+ /* A slot whose restart_lsn has never been reserved cannot be
advanced */
+ if (!XLogRecPtrIsValid(MyReplicationSlot->data.restart_lsn))
Shouldn't ReplicationSlotAcquire() also be moved inside the PG_TRY() block?
Because it can throw an error after setting MyReplicationSlot.
Regards,
--
Fujii Masao
| From | Date | Subject | |
|---|---|---|---|
| Next Message | jian he | 2026-05-11 03:13:15 | Re: COPY ON_CONFLICT TABLE; save duplicated record to another table. |
| Previous Message | Tatsuo Ishii | 2026-05-11 02:39:09 | Re: Proposal: tighten validation for legacy EUC encodings or document that accepted byte sequences may be unconvertible to UTF8 |