| From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
|---|---|
| To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
| Cc: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(at)eisentraut(dot)org> |
| Subject: | Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions |
| Date: | 2025-12-29 08:24:55 |
| Message-ID: | 59FB38EF-FA62-41B7-A082-DDA251B04F9E@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
> On Dec 22, 2025, at 14:26, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Dec 19, 2025 at 11:14 AM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>>
>> On Thursday, December 18, 2025 12:21 PM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>>>
>>>> On Dec 17, 2025, at 16:48, Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com>
>>> wrote:
>>>>
>>>> On Wednesday, December 17, 2025 3:56 PM Chao Li
>>> <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>>>>> Thank you both for all your advice. Here comes my first
>>>>> implementation of INHERIT in the attached v2 patch.
>>>>>
>>>>> On Wed, Dec 17, 2025 at 8:11 AM Euler Taveira
>>> <mailto:euler(at)eulerto(dot)com> wrote:
>>>>>
>>>>>> I wondering if we use INHERIT as default. The main advantage is
>>>>>> usability as Chao Li already mentioned. Is there any cases that
>>>>>> having a different replica identity from parent/partitioned table makes
>>> sense?
>>>>>
>>>>> We can leave this topic open for discussion. In my current
>>>>> implementation, NO INHERIT is still the default. But if we decide to
>>>>> switch the default, I can add a new commit that should include only 1
>>>>> line code change in gram.y and a tiny doc update.
>>>>>
>>>>> 0001 - when a new partition is created, use the parent's replication
>>>>> identity
>>>>> 0002 - add INHERIT | NO INHERIT
>>>>
>>>
>>> Hi Zhijie,
>>>
>>> Thanks for your feedback and linked information. I think this patch is avoiding
>>> the hard problem of “index” RI.
>>>
>>>>
>>>> I think there are several design considerations for this proposal:
>>>>
>>>> 1) Since the index names can vary across different partitions, what
>>>> should be the expected behavior if a new partition cannot identify the
>>>> same replica identity key as the root partitioned table?
>>>
>>> Index RI is skipped in this patch. INHERT only works for DEFAULT, FULL and
>>> NONE.
>>>
>>
>> I personally find skipping this part to be inconsistent, particularly given the
>> existing disparities among ALTER TABLE commands related to partitioned table handling.
>> Omitting this part introduces further inconsistency within the ALTER TABLE
>> REPLICA IDENTITY.
>>
>
> Fair point. I think one should summarize the previous discussions with
> key differences and where the previous patch got stuck. Then, it would
> be good to get some feedback from the people involved previously. If
> there is an agreement that we can do INHERIT stuff for specific parts
> then fine, otherwise, I think we need to address the index part as
> well.
> [1] https://www.postgresql.org/message-id/flat/201902041630.gpadougzab7v%40alvherre.pgsql
> [2] https://www.postgresql.org/message-id/flat/OSBPR01MB2982A2738F16722899A50082FECB0%40OSBPR01MB2982.jpnprd01.prod.outlook.com#2e5388a7cde3c10430f8668a6c381b06
Sure, let me try to summarize the two discussions.
For [1], key participants included Álvaro Herrera (patch author), Dmitry Dolgov (reviewer/tester), Robert Haas, Simon Riggs, Michael Paquier, and Peter Eisentraut.
----
Brief summary with rough timeline:
* In 2019, Álvaro proposed a patch that essentially did the same thing as this patch. That patch also attempted to handle “index-based” replica identity, which my patch intentionally avoids.
* Dmitry tested the patch and reported issues related to attaching partitions, including index-related errors.
* Robert then pointed out that having REPLICA IDENTITY recurse was inconsistent with the behavior of other ALTER TABLE actions such as TABLESPACE, and emphasized that we should really try to think through *all* actions that might recurse to child partitions.
* Álvaro noted the use of `ALTER TABLE ONLY`, suggesting that WITHOUT ONLY could recurse to children, while ONLY would affect just the parent.
* Simon commented that recursing TABLESPACE changes could be problematic because they imply physical data rewrites.
* Robert listed several ALTER TABLE actions that lacked consistent recurse behavior (identity columns, triggers, CLUSTER, OWNER, TABLESPACE, CHECK constraints).
* This led to broader discussion about whether TABLESPACE/OWNER/etc. should recurse.
* Michael echoed support for having REPLICA IDENTITY recurse, controlled via ONLY.
* Peter pointed out that recursing ADD GENERATED AS IDENTITY / DROP IDENTITY may not be feasible.
* Álvaro wanted to proceed with the patch.
* Robert maintained that defining consistent semantics for all relevant ALTER TABLE actions should come first.
Overall, the blocker was an unresolved semantic disagreement, rather than a concrete technical objection to the patch itself. There appeared to be broad agreement that:
* New partitions should use the parent’s replica identity.
* ONLY could be used to control whether replica identity changes recurse.
However, the discussion about “semantic consistency” significantly broadened the scope, and there was no clear agreement on whether TABLESPACE/OWNER/etc. should recurse, which ultimately stalled the effort.
For [2], key participants included: Takayuki Tsunakawa (patch author), Bharath Rupireddy (reviewer), Álvaro Herrera, Michael Paquier, and Kyotaro Horiguchi.
----
Brief summary with rough timeline:
* In 2020, Takayuki proposed a patch intended to propagate ALTER TABLE SET LOGGED / UNLOGGED to partitions.
* Bharath reviewed the patch and raised a number of questions and edge cases.
* There were initial discussions about the patch mechanics and expected behavior.
* Álvaro then pointed out the strong relation to the earlier discussion in [1].
* The focus of the discussion shifted to the more fundamental question of “what is the parent?”:
* Takayuki viewed ALTER TABLE on a partitioned table as a convenient batch operation on existing partitions, with future partitions remaining independent.
* Álvaro, Michael, and Kyotaro argued that changing a property on the parent should define the default for future partitions as well.
* No clear agreement was reached on this semantic question, and the discussion expanded into broader concerns about consistency across ALTER TABLE actions.
* Takayuki withdrew the patch
Overall, [2] also fell under the umbrella of “semantic consistency”, the main discussion was not about replica identity itself.
Current situation:
----
* Resolving “semantic consistency” across ALTER TABLE actions in a single release appears to be the biggest challenge. However, addressing everything in one patch set does not seem realistic.
* The core question of “what is the parent?” from [1] remains central. That said, the discussion appeared to lean toward the view that future partitions should inherit properties from the parent.
* Different properties behave very differently. For example, propagating REPLICA IDENTITY is a metadata-only change and relatively safe, while propagating TABLESPACE implies physical data movement and is much riskier. As a result, each ALTER TABLE action may deserve its own discussion and patch set.
* The inconsistency is not limited to ALTER TABLE but also exists in CREATE TABLE behavior. For example, a new partition inherits TABLESPACE from the parent, but not REPLICA IDENTITY.
* “ALTER TABLE ONLY table_name” is commonly suggested as the mechanism to control whether changes should recurse to partitions.
How to proceed?
----
If we stop here, these inconsistencies will remain indefinitely, which I believe nobody really wants. With that in mind, I’d like to suggest a two-phase approach.
Phase 1: Document current behavior and set expectations
* Identify all ALTER TABLE actions involved in these inconsistencies.
* Update the ALTER TABLE and CREATE TABLE documentation to clearly describe the current behavior for partitioned tables, and (where appropriate) the intended or ideal behavior.
* Explicitly document the meaning of ONLY for partitioned tables, and note that some actions may behave differently, with details described in each action’s section.
Phase 2: Address actions incrementally
* Work on each ALTER TABLE action individually, recognizing that some may be straightforward while others require more design discussion, coding, and testing.
* With Phase 1 in place to set expectations, it may not be necessary to complete all actions in a single release.
* That said, it would still be desirable to keep the work bounded, for example within one or two major releases, to avoid long-term fragmentation.
I’ve included the participants from the previous discussions on CC, in case they want to comment further.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Pavlo Golub | 2025-12-29 08:31:36 | Re[2]: [PATCH] Add pg_current_vxact_id() function to expose virtual transaction IDs |
| Previous Message | zourenli | 2025-12-29 08:20:25 | [PATCH v1] replindex: Fix comment grammar in build_replindex_scan_key() |