logical replication of truncate command with trigger causes Assert

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: logical replication of truncate command with trigger causes Assert
Date: 2021-06-08 21:52:14
Message-ID: B4A3AF82-79ED-4F4C-A4E5-CD2622098972@enterprisedb.com
Lists: pgsql-hackers

Hackers,

On master, when a statement-level trigger fires for a replicated TRUNCATE command, the following assertion failure and stack trace are produced:

TRAP: FailedAssertion("portal != NULL", File: "pquery.c", Line: 1760, PID: 93854)
0 postgres 0x0000000108e269f2 ExceptionalCondition + 130
1 postgres 0x0000000108bef2f4 EnsurePortalSnapshotExists + 100
2 postgres 0x0000000108a93231 _SPI_execute_plan + 529
3 postgres 0x0000000108a93c0f SPI_execute_plan_with_paramlist + 127
4 plpgsql.so 0x00000001098bf9e5 exec_stmt_execsql + 277
5 plpgsql.so 0x00000001098bbaf6 exec_stmts + 294
6 plpgsql.so 0x00000001098bb367 exec_stmt_block + 1127
7 plpgsql.so 0x00000001098ba57a plpgsql_exec_trigger + 442
8 plpgsql.so 0x00000001098cb5b1 plpgsql_call_handler + 305
9 postgres 0x0000000108a3137c ExecCallTriggerFunc + 348
10 postgres 0x0000000108a3447d afterTriggerInvokeEvents + 1517
11 postgres 0x0000000108a33bb0 AfterTriggerEndQuery + 128
12 postgres 0x0000000108a1a9e2 ExecuteTruncateGuts + 2210
13 postgres 0x0000000108b83369 apply_dispatch + 3913
14 postgres 0x0000000108b82185 LogicalRepApplyLoop + 485
15 postgres 0x0000000108b81f87 ApplyWorkerMain + 1047
16 postgres 0x0000000108b474a2 StartBackgroundWorker + 386
17 postgres 0x0000000108b55cf6 maybe_start_bgworkers + 1254
18 postgres 0x0000000108b54510 sigusr1_handler + 464
19 libsystem_platform.dylib 0x00007fff69f3d5fd _sigtramp + 29
20 ??? 0x0000000000000000 0x0 + 0
21 postgres 0x0000000108b537ae PostmasterMain + 3726
22 postgres 0x0000000108aaa140 help + 0
23 libdyld.dylib 0x00007fff69d44cc9 start + 1
24 ??? 0x0000000000000004 0x0 + 4

I believe the issue was introduced in commit 84f5c2908da, which added EnsurePortalSnapshotExists. That function cannot work in the case of logical replication, because there is neither an ActivePortal nor a snapshot.
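
To illustrate the failure mode described above, here is a minimal stand-alone sketch in C. It is not the PostgreSQL source: the ToyPortal type, the toy_* globals, and the EnsureResult codes are hypothetical stand-ins, and the sketch returns an error code where the real function trips the "portal != NULL" Assert. It only models the idea that, when no snapshot is active, the check expects an active portal to own the new one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a portal; the real Portal struct is far larger. */
typedef struct ToyPortal
{
    bool has_portal_snapshot;
} ToyPortal;

/* Toy globals modelling per-backend state (illustrative names only). */
static ToyPortal *toy_active_portal = NULL;
static bool toy_active_snapshot_set = false;

typedef enum
{
    ENSURE_OK,          /* a snapshot is available after the call */
    ENSURE_NO_PORTAL    /* models the failed "portal != NULL" Assert */
} EnsureResult;

/*
 * Simplified model of the check: if a snapshot is already active there is
 * nothing to do; otherwise an active portal must exist to own a new one.
 */
static EnsureResult
toy_ensure_portal_snapshot_exists(void)
{
    if (toy_active_snapshot_set)
        return ENSURE_OK;

    /* An apply worker with no portal and no snapshot ends up here. */
    if (toy_active_portal == NULL)
        return ENSURE_NO_PORTAL;

    /* The portal must not already own a snapshot at this point. */
    assert(!toy_active_portal->has_portal_snapshot);

    /* Model "take a snapshot and remember that the portal owns it". */
    toy_active_portal->has_portal_snapshot = true;
    toy_active_snapshot_set = true;
    return ENSURE_OK;
}
```

In this model, calling toy_ensure_portal_snapshot_exists() with both globals in their initial state returns ENSURE_NO_PORTAL, which corresponds to the situation in the trace: the trigger's SPI call reaches the check from the apply worker without either a snapshot or a portal in place.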

Attached patch v1-0001 reliably reproduces the problem, though you have to Ctrl-C out of the test, because logical replication gets stuck in a loop after the Assert is triggered. You can see the stack trace by opening tmp_check/log/021_truncate_subscriber.log.

Attachment: v1-0001-Adding-test-to-trigger-logical-replication-assert.patch (application/octet-stream, 3.0 KB)
