Re: Proposal: Conflict log history table for Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Proposal: Conflict log history table for Logical Replication
Date: 2025-09-27 21:13:39
Message-ID: CAA4eK1L5ZeGtcdmkyKnu1LnPAcC4X=ahRiXh=_yejT=SMDHLsA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Sep 27, 2025 at 9:24 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Sat, Sep 27, 2025 at 8:53 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > I am not able to understand what exact problem you are seeing here. I
> > was thinking that during the CREATE SUBSCRIPTION command, a new table
> > with user provided name will be created similar to how we create a
> > slot. The difference would be that we create a slot on the
> > remote/publisher node but this table will be created locally.
> >
> That's not an issue. The problem we are discussing here is that the
> conflict history table, which is created on the subscriber node, should
> not be published when this subscriber node creates another
> publication with the ALL TABLES option. So we found an option of
> inserting into this table with the HEAP_INSERT_NO_LOGICAL flag so that
> those inserts will not be decoded. But what about another node
> subscribing to this publisher? It would be expected to have this table,
> because when ALL TABLES are published, the subscriber node expects all
> user tables to be present there even if their changes are not
> published. Consider the example below:
>
> Node1:
> CREATE PUBLICATION pub_node1..
>
> Node2:
> CREATE SUBSCRIPTION sub.. PUBLICATION pub_node1
> WITH(conflict_history_table='my_conflict_table');
> CREATE PUBLICATION pub_node2 FOR ALL TABLES;
>
> Node3:
> CREATE SUBSCRIPTION sub1.. PUBLICATION pub_node2; --this will expect
> 'my_conflict_table' to exist here because when it calls
> pg_get_publication_tables() on Node2 it will also get
> 'my_conflict_table' along with the other user tables.
>
> As a solution, I wanted this table to be skipped when
> pg_get_publication_tables() is called.
> Option1: If the table name is listed as a conflict history table
> in any of the subscriptions on Node2, we ignore it.
> Option2: Provide a new table option to mark a table as non-publishable
> when the ALL TABLES option is provided. I think this option can be
> useful independently as well.
>

I agree that option-2 is useful and, IIUC, we are already working on
something similar in thread [1]. However, it is better to use option-1
here because we already use a non-user-specified mechanism to skip
changes during replication, so following the same approach at other
times is preferable. Once we have that other feature [1], we can
probably optimize this code to use it without taking input from the
user. The other reason for not going with option-2 in the way you are
proposing is that it doesn't seem like a good idea to have multiple
ways to specify which tables to skip from publishing. I find the
approach being discussed in thread [1] more generic and better than a
new table-level option.
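
To make option-1 concrete, here is a rough sketch of the filtering idea
as a toy model in Python. It is not PostgreSQL code: the Node and
Subscription classes, the conflict_history_table field, and the
all_tables_publication() function are hypothetical names used only for
illustration, and the real check would presumably live wherever the
table list for an ALL TABLES publication is built.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Subscription:
    """Toy stand-in for a subscription (hypothetical, not a catalog row)."""
    name: str
    # Hypothetical option proposed in this thread; None means "not set".
    conflict_history_table: Optional[str] = None


@dataclass
class Node:
    """Toy stand-in for a node holding user tables and subscriptions."""
    user_tables: List[str]
    subscriptions: List[Subscription] = field(default_factory=list)


def all_tables_publication(node: Node) -> List[str]:
    """Return the tables an ALL TABLES publication would list under option-1.

    Any table named as a conflict history table by a subscription on the
    same node is skipped, without requiring any input from the user.
    """
    conflict_tables = {
        sub.conflict_history_table
        for sub in node.subscriptions
        if sub.conflict_history_table is not None
    }
    return [t for t in node.user_tables if t not in conflict_tables]


# Node2 from the example above: it subscribes with a conflict history
# table and then publishes all tables. "orders" and "customers" are
# made-up user tables.
node2 = Node(
    user_tables=["orders", "customers", "my_conflict_table"],
    subscriptions=[Subscription("sub", "my_conflict_table")],
)
published = all_tables_publication(node2)
print(published)
```

With this filtering, Node3 would no longer be told about
'my_conflict_table' when it asks Node2 for the published tables.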

[1] - https://www.postgresql.org/message-id/CANhcyEVt2CBnG7MOktaPPV4rYapHR-VHe5%3DqoziTZh1L9SVc6w%40mail.gmail.com
--
With Regards,
Amit Kapila.
