Logical WAL sender unresponsive during decoding commit

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Logical WAL sender unresponsive during decoding commit
Date: 2022-08-16 03:57:54
Message-ID: B319ECD6-9A28-4CDF-A8F4-3591E0BF2369@yandex-team.ru
Lists: pgsql-hackers

Hi hackers!

Some time ago I saw a logical replication session hang while trying to send a transaction commit after pg_repack had been run on a table.
I understand that those things do not mix well. Yet the walsender was ignoring pg_terminate_backend(), and I think this is worth fixing.
Can we add CHECK_FOR_INTERRUPTS(); somewhere in this backtrace? The full session is attached as a file.

#0 pfree (pointer=0x561850bbee40) at ./build/../src/backend/utils/mmgr/mcxt.c:1032
#1 0x00005617712530d6 in ReorderBufferReturnTupleBuf (tuple=<optimized out>, rb=<optimized out>) at ./build/../src/backend/replication/logical/reorderbuffer.c:469
#2 ReorderBufferReturnChange (rb=<optimized out>, change=0x561772456048) at ./build/../src/backend/replication/logical/reorderbuffer.c:398
#3 0x0000561771253da1 in ReorderBufferRestoreChanges (rb=rb@entry=0x561771c14e10, txn=0x561771c0b078, file=file@entry=0x561771c15168, segno=segno@entry=0x561771c15178) at ./build/../src/backend/replication/logical/reorderbuffer.c:2570
#4 0x00005617712553ba in ReorderBufferIterTXNNext (state=0x561771c15130, rb=0x561771c14e10) at ./build/../src/backend/replication/logical/reorderbuffer.c:1146
#5 ReorderBufferCommit (rb=0x561771c14e10, xid=xid@entry=2976347782, commit_lsn=79160378448744, end_lsn=<optimized out>, commit_time=commit_time@entry=686095734290578, origin_id=origin_id@entry=0, origin_lsn=0) at ./build/../src/backend/replication/logical/reorderbuffer.c:1523
#6 0x000056177124a30a in DecodeCommit (xid=2976347782, parsed=0x7ffc3cb4c240, buf=0x7ffc3cb4c400, ctx=0x561771b10850) at ./build/../src/backend/replication/logical/decode.c:640
#7 DecodeXactOp (ctx=0x561771b10850, buf=buf@entry=0x7ffc3cb4c400) at ./build/../src/backend/replication/logical/decode.c:248
#8 0x000056177124a6a9 in LogicalDecodingProcessRecord (ctx=0x561771b10850, record=0x561771b10ae8) at ./build/../src/backend/replication/logical/decode.c:117
#9 0x000056177125d1e5 in XLogSendLogical () at ./build/../src/backend/replication/walsender.c:2893
#10 0x000056177125f5f2 in WalSndLoop (send_data=send_data@entry=0x56177125d180 <XLogSendLogical>) at ./build/../src/backend/replication/walsender.c:2242
#11 0x0000561771260125 in StartLogicalReplication (cmd=<optimized out>) at ./build/../src/backend/replication/walsender.c:1179
#12 exec_replication_command (cmd_string=cmd_string@entry=0x561771abe590 "START_REPLICATION SLOT dttsjtaa66crdhbm015h LOGICAL 0/0 ( \"include-timestamp\" '1', \"include-types\" '1', \"include-xids\" '1', \"write-in-chunks\" '1', \"add-tables\" '/* sanitized */.claim_audit,public.__consu"...) at ./build/../src/backend/replication/walsender.c:1612
#13 0x00005617712b2334 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x561771b2a438, dbname=<optimized out>, username=<optimized out>) at ./build/../src/backend/tcop/postgres.c:4267
#14 0x000056177123857c in BackendRun (port=0x561771b0d7a0, port=0x561771b0d7a0) at ./build/../src/backend/postmaster/postmaster.c:4484
#15 BackendStartup (port=0x561771b0d7a0) at ./build/../src/backend/postmaster/postmaster.c:4167
#16 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1725
#17 0x000056177123954b in PostmasterMain (argc=9, argv=0x561771ab70e0) at ./build/../src/backend/postmaster/postmaster.c:1398
#18 0x0000561770fae8b6 in main (argc=9, argv=0x561771ab70e0) at ./build/../src/backend/main/main.c:228
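
To make the request concrete, here is a minimal standalone sketch of the pattern, not the attached patch and not PostgreSQL code. All names in it (interrupt_pending, handle_terminate, check_for_interrupt, replay_changes) are hypothetical stand-ins. The real CHECK_FOR_INTERRUPTS() acts on the pending interrupt itself rather than returning a flag, so this only models the idea that a long per-change loop notices a termination request if it polls on every iteration.

```c
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the backend's "interrupt pending" flag. */
static volatile sig_atomic_t interrupt_pending = 0;

/* Signal handler: only records that termination was requested. */
static void
handle_terminate(int signo)
{
    (void) signo;
    interrupt_pending = 1;
}

/* Simplified analogue of an interrupt check: true means "stop now". */
static bool
check_for_interrupt(void)
{
    return interrupt_pending != 0;
}

/*
 * Model of a long loop over a transaction's changes.
 * Returns how many changes were processed. With poll_interrupts = false
 * the loop always runs to completion, which is the behaviour reported
 * above; with poll_interrupts = true it stops at the next iteration
 * after a termination request arrives.
 */
static size_t
replay_changes(size_t nchanges, bool poll_interrupts)
{
    size_t done = 0;

    for (size_t i = 0; i < nchanges; i++)
    {
        if (poll_interrupts && check_for_interrupt())
            break;

        /* Placeholder for the per-change work (restore, decode, free). */
        done++;
    }
    return done;
}
```

After installing handle_terminate for SIGTERM and delivering the signal, replay_changes(n, true) returns early while replay_changes(n, false) still processes all n changes. Where exactly such a check belongs in the real call chain is the open question of this message.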

What do you think?

Thank you!

Best regards, Andrey Borodin.

Attachment                     Content-Type              Size
check_for_interrupts.diff      application/octet-stream  514 bytes
stuck_commit_replication.txt   text/plain                9.0 KB
