Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Subject: Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions
Date: 2026-01-30 01:22:16
Message-ID: 25034397-BBCD-4642-A86B-2811FC82DC64@gmail.com
Lists: pgsql-hackers

> On Jan 26, 2026, at 10:51, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>
> In a previous discussion [4], Dmitry Dolgov pointed out a test case that resulted in a DEADLOCK. I ran that test against v3. The test still fails, but I no longer observe a deadlock; instead, the server now crashes during partition attachment. I will investigate this further.

I tried to investigate the server crash yesterday, but I can no longer reproduce it. In the record of the first crash I encountered, the call stack looked like this:
```
TRAP: failed Assert("entry->data.lockmode == BUFFER_LOCK_UNLOCK"), File: "bufmgr.c", Line: 5908, PID: 47991
0 postgres 0x00000001013d9bb0 ExceptionalCondition + 216
1 postgres 0x0000000101129a80 BufferLockConditional + 88
2 postgres 0x0000000101129a04 ConditionalLockBuffer + 224
3 postgres 0x0000000100b8966c _bt_conditionallockbuf + 28
4 postgres 0x0000000100b88714 _bt_allocbuf + 128
5 postgres 0x0000000100b858d4 _bt_split + 1496
6 postgres 0x0000000100b82cec _bt_insertonpg + 1520
7 postgres 0x0000000100b81220 _bt_doinsert + 608
8 postgres 0x0000000100b9a008 btinsert + 120
9 postgres 0x0000000100b7a224 index_insert + 552
10 postgres 0x0000000100c5dd50 CatalogIndexInsert + 764
11 postgres 0x0000000100c5df60 CatalogTupleUpdate + 100
12 postgres 0x0000000100c7e608 ConstraintSetParentConstraint + 580
13 postgres 0x0000000100de6598 AttachPartitionEnsureIndexes + 1596
14 postgres 0x0000000100de5cac attachPartitionTable + 80
15 postgres 0x0000000100dd864c ATExecAttachPartition + 2520
16 postgres 0x0000000100dcb8e8 ATExecCmd + 4464
17 postgres 0x0000000100dc6054 ATRewriteCatalogs + 408
18 postgres 0x0000000100dbfa18 ATController + 256
19 postgres 0x0000000100dbf84c AlterTable + 96
20 postgres 0x00000001011a3508 ProcessUtilitySlow + 1704
21 postgres 0x00000001011a111c standard_ProcessUtility + 3504
22 postgres 0x00000001011a035c ProcessUtility + 360
23 postgres 0x000000010119fa10 PortalRunUtility + 216
24 postgres 0x000000010119eae0 PortalRunMulti + 688
25 postgres 0x000000010119e018 PortalRun + 788
26 postgres 0x0000000101198dcc exec_simple_query + 1380
27 postgres 0x0000000101197ee8 PostgresMain + 3244
28 postgres 0x000000010118f8d0 BackendInitialize + 0
29 postgres 0x0000000101061f3c postmaster_child_launch + 456
30 postgres 0x00000001010696c8 BackendStartup + 304
31 postgres 0x0000000101067564 ServerLoop + 372
32 postgres 0x0000000101066044 PostmasterMain + 6440
33 postgres 0x0000000100ee40a4 main + 924
34 dyld 0x000000019a36dd54 start + 7184
2026-01-26 09:52:41.240 CST [46845] LOG: client backend (PID 47991) was terminated by signal 6: Abort trap: 6
```

I noticed that the Assert in bufmgr.c was removed earlier today by commit 333f58637.

However, with the server crash no longer occurring, the deadlock reappeared. After some investigation, I confirmed that the deadlock is not specific to this patch: I can consistently reproduce it with ATTACH PARTITION on the master branch, which suggests a more general problem.

I’ll start a new thread to follow up on the deadlock separately.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
