Re: Proposal: Conflict log history table for Logical Replication

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Proposal: Conflict log history table for Logical Replication
Date: 2025-09-26 11:12:11
Message-ID: CAFiTN-tQiakd8m+-d6WN6RpJXSv_JcropZ2oGzme4d1JudQhYg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 25, 2025 at 4:19 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Thu, Sep 25, 2025 at 11:53 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > > [1]
> > > /*
> > > * For logical decode we need combo CIDs to properly decode the
> > > * catalog
> > > */
> > > if (RelationIsAccessibleInLogicalDecoding(relation))
> > > log_heap_new_cid(relation, &tp);
> > >
> >
> > Meanwhile I am also exploring the option where we can just CREATE TYPE
> > in initialize_data_directory() during initdb, basically we will create
> > this type in template1 so that it will be available in all the
> > databases, and that would simplify the table creation whether we
> > create internally or we allow user to create it. And while checking
> > is_publishable_class we can check the type and avoid publishing those
> > tables.
> >
>
> Based on my off list discussion with Amit, one option could be to set
> HEAP_INSERT_NO_LOGICAL option while inserting tuple into conflict
> history table, for that we can not use SPI interface to insert instead
> we will have to directly call the heap_insert() to add this option.
> Since we do not want to create any trigger etc on this table, direct
> insert should be fine, but if we plan to create this table as
> partitioned table in future then direct heap insert might not work.

Upon further reflection, I realized that while this approach avoids
streaming inserts to the conflict log history table, it still requires
that table to exist on the subscriber node upon subscription creation,
which isn't ideal.

We have two main options to address this:

Option 1:
When pg_get_publication_tables() is called for a publication that uses
the 'alltables' option, we can scan all subscriptions and explicitly
filter out their conflict history tables. This should not be very
costly, because the scan happens only when pg_get_publication_tables()
is called, and that happens only during CREATE SUBSCRIPTION / ALTER
SUBSCRIPTION on the remote node.
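
To make the filtering idea concrete, here is a rough standalone sketch,
not patch code. The Oid typedef is a stand-in, and both function names
and the plain-array representation are hypothetical; a real
implementation would work on catalog data and a List of relation OIDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;		/* stand-in for PostgreSQL's Oid */

/*
 * Hypothetical helper: true if relid is the conflict history table of
 * some subscription. conflict_tables stands in for the result of
 * scanning all subscriptions.
 */
bool
is_conflict_history_table(Oid relid, const Oid *conflict_tables,
						  size_t nconflict)
{
	for (size_t i = 0; i < nconflict; i++)
	{
		if (conflict_tables[i] == relid)
			return true;
	}
	return false;
}

/*
 * Hypothetical filter step for the 'alltables' case: compact rels in
 * place, dropping conflict history tables, and return the new count.
 */
size_t
filter_conflict_history_tables(Oid *rels, size_t nrels,
							   const Oid *conflict_tables,
							   size_t nconflict)
{
	size_t		kept = 0;

	assert(rels != NULL || nrels == 0);

	for (size_t i = 0; i < nrels; i++)
	{
		if (!is_conflict_history_table(rels[i], conflict_tables, nconflict))
			rels[kept++] = rels[i];
	}
	return kept;
}
```

The lookup here is a linear search, which seems acceptable given that
the number of subscriptions per node is normally small.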

Option 2:
Alternatively, we could introduce a table creation option, like a
'non-publishable' flag, that prevents a table from being streamed at
all. I believe this would be a valuable, independent feature for
users who want to create certain tables without including them in
logical replication.
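
As a rough illustration of how such a flag could feed into the
publishability check, here is a simplified standalone sketch. The
DemoRel struct, its fields, and demo_is_publishable() are all
hypothetical stand-ins; this is not the actual is_publishable_class()
logic, only the shape of the extra test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the relation properties involved. */
typedef struct DemoRel
{
	bool		is_ordinary_table;	/* plain or partitioned table */
	bool		is_catalog;			/* system catalog relation */
	bool		non_publishable;	/* assumed new per-table flag */
} DemoRel;

/*
 * Hypothetical check: a relation is publishable only if it passes the
 * usual tests and has not been marked non-publishable.
 */
bool
demo_is_publishable(const DemoRel *rel)
{
	assert(rel != NULL);

	if (!rel->is_ordinary_table || rel->is_catalog)
		return false;

	/* The new flag opts the table out of logical replication. */
	return !rel->non_publishable;
}
```

With a check like this, a conflict history table would simply be
created with the flag set, and user tables could use the same flag.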

I prefer Option 2, as I feel it adds value independent of this patch.

--
Regards,
Dilip Kumar
Google
