Re: Proposal: Conflict log history table for Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Proposal: Conflict log history table for Logical Replication
Date: 2025-12-04 10:50:22
Message-ID: CAA4eK1+b2Ws0e_ZYJsgZAPn7VWndxAK_YM_QMKcfXst3e7F6Jg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Thu, Dec 4, 2025 at 7:31 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
>
> > The patch utilizes SPI for creating and dropping the conflict history
> > table, but I'm really not sure if it's okay because it's actually
> > affected by some GUC parameters such as default_tablespace and
> > default_toast_compression etc. Also, probably some hooks and event
> > triggers could be fired during the creation and removal. Is it
> > intentional behavior? I'm concerned that it would make investigation
> > harder if an issue happened in the user environment.
>
> Hmm, interesting point, well we can control the value of default
> parameters while creating the table using SPI, but I don't see any
> reason to not use heap_create_with_catalog() directly, so maybe that's
> a better choice than using SPI because then we don't need to bother
> about any event triggers/utility hooks etc. Although I don't see any
> specific issue with that, unless the user intentionally wants to
> create trouble while creating this table. What do others think about
> it?
>
> > ---
> > + /* build and execute the CREATE TABLE query. */
> > + appendStringInfo(&querybuf,
> > + "CREATE TABLE %s.%s ("
> > + "relid Oid,"
> > + "schemaname TEXT,"
> > + "relname TEXT,"
> > + "conflict_type TEXT,"
> > + "remote_xid xid,"
> > + "remote_commit_lsn pg_lsn,"
> > + "remote_commit_ts TIMESTAMPTZ,"
> > + "remote_origin TEXT,"
> > + "key_tuple JSON,"
> > + "remote_tuple JSON,"
> > + "local_conflicts JSON[])",
> > + quote_identifier(get_namespace_name(namespaceId)),
> > + quote_identifier(conflictrel));
> >
> > If we want to use SPI for history table creation, we should use
> > qualified names in all the places including data types.
>
> That's true, so that we can avoid interference of any user created types.
>
> > ---
> > The patch doesn't create the dependency between the subscription and
> > the conflict history table. So users can entirely drop the schema
> > (with CASCADE option) where the history table is created.
>
> I think as part of the initial discussion we thought since it is
> created under the subscription owner privileges so only that user can
> drop that table and if the user intentionally drops the table the
> conflict will not be recorded in the table and that's acceptable. But
> now I think it would be a good idea to maintain the dependency with
> subscription so that users can not drop it without dropping the
> subscription.
>

Yeah, it seems reasonable to maintain its dependency with the
subscription in this model. BTW, for this it would be easier to record
dependency, if we use heap_create_with_catalog() as we do for
create_toast_table(). The other places where we use SPI interface to
execute statements are either the places where we need to execute
multiple SQL statements or non-CREATE Table statements. So, for this
patch's purpose, I feel heap_create_with_catalog() suits more.

I was also thinking whether it is a good idea to create one global
conflict table and let all subscriptions use it. However, it has
disadvantages like whenever, user drops any subscription, we need to
DELETE all conflict rows for that subscription causing the need for
vacuum. Then we somehow need to ensure that conflicts from one
subscription_owner are not visible to other subscription_owner via
some RLS policy. So, catalog table per-subscription (aka) the current
way appears better.

Also, shall we give the option to the user where she wants to see
conflict/resolution information? One idea to achieve the same is to
provide subscription options like (a) conflict_resolution_format, the
values could be log and table for now, in future, one could extend it
to other options like xml, json, etc. (b) conflict_log_table: in this
user can specify the conflict table name, this can be optional such
that if user omits this and conflict_resolution_format is table, then
we will use internally generated table name like
pg_conflicts_<subscription_id>.

> And once
> > dropping the schema along with the history table, ALTER SUBSCRIPTION
> > ... SET (conflict_history_table = '') seems not to work (I got a
> > SEGV).
>
> I will check this, thanks
>
> > ---
> > We can create the history table in pg_temp namespace but it should not
> > be allowed.
>
> Right, will check this and also add the test for the same.
>
> > ---
> > I think the conflict history table should not be transferred to the
> > new cluster when pg_upgrade since the table definition could be
> > different across major versions.
>
> Let me think more on this with respect to behaviour of other factors
> like subscriptions etc.
>

Can we deal with different schema of tables across versions via
pg_dump/restore during upgrade?

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2025-12-04 10:52:07 Re: headerscheck ccache support
Previous Message Heikki Linnakangas 2025-12-04 10:39:43 Re: POC: make mxidoff 64 bits