| From: | vignesh C <vignesh21(at)gmail(dot)com> |
|---|---|
| To: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Proposal: Conflict log history table for Logical Replication |
| Date: | 2025-12-12 05:57:18 |
| Message-ID: | CALDaNm21+2bpA634D6Nk_yd=f3Ub_G_n0pfNCbVZvFiwQZo97g@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Thu, 11 Dec 2025 at 19:50, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Thu, Dec 11, 2025 at 5:57 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Thu, Dec 11, 2025 at 5:10 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Dec 11, 2025 at 5:04 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > > 2)
> > > > When we do below:
> > > > alter subscription sub1 SET (conflict_log_table=clt2);
> > > >
> > > > the previous conflict log table is dropped. Is this behavior
> > > > intentional and discussed/concluded earlier? It’s possible that a user
> > > > may want to create a new conflict log table for future events while
> > > > still retaining the old one for analysis. If the subscription itself
> > > > is dropped, then dropping the CLT makes sense, but I’m not sure this
> > > > behavior is intended for ALTER SUBSCRIPTION. I do understand that
> > > > once we unlink CLT from subscription, later even DROP subscription
> > > > cannot drop it, but user can always drop it when not needed.
> > > >
> > > > If we plan to keep existing behavior, it should be clearly documented
> > > > in a CAUTION section, and the command should explicitly log the table
> > > > drop.
> > >
> > > Yeah we discussed this behavior and the conclusion was we would
> > > document this behavior and its user's responsibility to take necessary
> > > backup of the conflict log table data if they are setting a new log
> > > table or NONE for the subscription.
> > >
> >
> > +1. If we don't do this then it will be difficult to track for
> > postgres or users the previous conflict history tables.
>
> Right, it makes sense.
>
> Attached patch fixed most of the open comments
> 1) \dRs+ now show the schema qualified name
> 2) Now key_tuple and replica_identify tuple both are add in conflict
> log tuple wherever applicable
> 3) Refactored the code so that we can define the conflict log table
> schema only once in the header file and both create_conflict_log_table
> and ValidateConflictLogTable use it.
>
> I was considering the interdependence between the subscription and the
> conflict log table (CLT). IMHO, it would be logical to establish the
> subscription as dependent on the CLT. This way, if someone attempts to
> drop the CLT, the system would recognize the dependency of the
> subscription and prevent the drop unless the subscription is removed
> first or the CASCADE option is used.
>
> However, while investigating this, I encountered an error [1] stating
> that global objects are not supported in this context. This indicates
> that global objects cannot be made dependent on local objects.
> Although making an object dependent on global/shared objects is
> possible for certain types of shared objects [2], this is not our main
> objective.
>
> We do not need to make the CLT dependent on the subscription because
> the table can be dropped when the subscription is dropped anyway and
> we are already doing it as part of drop subscription as well as alter
> subscription when CLT is set to NONE or a different table. Therefore,
> extending the functionality of shared dependency is unnecessary for
> this purpose.
I noticed an inconsistency in the checks that prevent adding a
conflict log table to a publication. At creation time, we explicitly
reject attempts to publish a conflict log table:
/* Can't be conflict log table */
if (IsConflictLogTable(RelationGetRelid(targetrel)))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("cannot add relation \"%s.%s\" to publication",
get_namespace_name(RelationGetNamespace(targetrel)),
RelationGetRelationName(targetrel)),
errdetail("This operation is not supported for conflict
log tables.")));
However, the restriction can be bypassed through a sequence of table
renames like below:
-- Set up logical replication
CREATE PUBLICATION pub_all;
CREATE SUBSCRIPTION sub1 CONNECTION '...' PUBLICATION pub_all WITH
(conflict_log_table = 'conflict');
-- Rename the conflict log table
ALTER TABLE conflict RENAME TO conflict1;
-- Now this succeeds:
CREATE PUBLICATION pub1 FOR TABLE conflict1;
-- Rename it back
ALTER TABLE conflict1 RENAME TO conflict;
\dRp+ pub1
Publication pub1
...
Tables:
public.conflict
Thus, although we prohibit publishing the conflict log table directly,
a publication can still end up referencing it through renaming. This
is inconsistent with the invariant the code attempts to enforce.
Should we extend the checks to handle renames so that a conflict log
table can never end up in a publication?
Alternatively, should the creation-time restriction be relaxed if this
case is acceptable?
If the invariant should be enforced, should we also prevent renaming a
conflict-log table into a published table's name?
Thoughts?
Regards,
Vignesh
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Chao Li | 2025-12-12 06:11:31 | Re: Import Statistics in postgres_fdw before resorting to sampling. |
| Previous Message | Noah Misch | 2025-12-12 05:05:18 | Re: Add WALRCV_CONNECTING state to walreceiver |