Re: Proposal: Conflict log history table for Logical Replication

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shvetamalik(at)gmail(dot)com>
Subject: Re: Proposal: Conflict log history table for Logical Replication
Date: 2026-06-22 04:03:44
Message-ID: CALDaNm20GcE2p8GfiTWdrp9T=1LTvxpws9djUkJBwZO_nyqqdQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, 22 Jun 2026 at 08:41, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Sun, Jun 21, 2026 at 7:53 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > While attempting to log a conflict, a concurrent ALTER SUBSCRIPTION
> > can change the conflict logging destination from all to log. In this
> > scenario, the apply worker may already have cached the conflictlogdest
> > information, including the OID of the current conflict log table.
> > However, the concurrent ALTER SUBSCRIPTION drops the conflict log
> > table as part of the destination change:
> > +Relation
> > +GetConflictLogDestAndTable(ConflictLogDest *log_dest)
> > +{
> > + Oid conflictlogrelid;
> > +
> > + /*
> > + * Convert the text log destination to the internal enum. MySubscription
> > + * already contains the data from pg_subscription.
> > + */
> > + *log_dest = GetConflictLogDest(MySubscription->conflictlogdest);
> > +
> > + /* Quick exit if a conflict log table was not requested. */
> > + if (!CONFLICTS_LOGGED_TO_TABLE(*log_dest))
> > + return NULL;
> > +
> > + conflictlogrelid = MySubscription->conflictlogrelid;
> > +
> > + Assert(OidIsValid(conflictlogrelid));
> > +
> > + return table_open(conflictlogrelid, RowExclusiveLock);
> > +}
> >
> > As a result, when the apply worker later attempts to open the cached
> > conflict log table, table_open() fails because the relation has
> > already been dropped. This causes the error handling path itself to
> > fail before the conflict record can be written to either the conflict
> > log table or the server log.
> >
> > In such cases, the conflict record is effectively lost and is not
> > logged anywhere. For example:
> > 2026-06-21 19:31:13.592 IST [263598] LOG: logical replication apply
> > worker for subscription "sub1" has started
> > 2026-06-21 19:32:26.731 IST [263598] ERROR: could not open relation
> > with OID 16405
> > 2026-06-21 19:32:26.731 IST [263598] CONTEXT: processing remote data
> > for replication origin "pg_16404" during message type "INSERT" for
> > replication target relation "public.t1" in transaction 698, finished
> > at 0/017D39A0
> > 2026-06-21 19:32:26.735 IST [263471] LOG: background worker "logical
> > replication apply worker" (PID 263598) exited with exit code 1
> >
> > Ideally, failure to access the conflict log table should not prevent
> > the conflict from being reported in the server log. This issue is
> > present with the v52 version. I have not yet checked if Amit's recent
> > patch posted a few minutes ago at [1] handles this issue.
> >
>
> There are two places in the patch from where we LOG/Insert the
> conflict data. First is ReportApplyConflict() where we LOG if the
> conflict arises from a non-ERROR path (aka conflicts other
> INSERT/UPDATE_EXISTS). In that case, the conflict data will be logged
> even when we fail to insert into CLT. Second is the place for
> conflicts that arose as ERRORs (aka INSERT/UPDATE_EXISTS), where the
> conflict information will be logged along with insert failure as
> CONTEXT. Can you please verify your test based on this input and share
> your findings and thoughts?

The scenario I am testing is an insert_exists conflict.
On the publisher:
CREATE TABLE t1 (c1 int);

On the subscriber:
CREATE TABLE t1 (c1 int PRIMARY KEY);

Then execute the following on the publisher:
INSERT INTO t1 VALUES (10);
INSERT INTO t1 VALUES (10);

The second insert generates an insert_exists conflict on the
subscriber. The conflict is reported and logged through the following
call chain:
apply_handle_insert
-> apply_handle_insert_internal
-> ExecSimpleRelationInsert
-> CheckAndReportConflict
-> ReportApplyConflict

Pause the apply worker inside GetConflictLogDestAndTable(), reached
from ReportApplyConflict(), immediately before it opens the conflict
log table:
...
return table_open(conflictlogrelid, RowExclusiveLock);
...

While the apply worker is paused, execute the following command concurrently:
ALTER SUBSCRIPTION sub1
SET (conflict_log_destination = 'log');

This succeeds and drops the conflict log table:
NOTICE: dropped conflict log table "pg_conflict.pg_conflict_log_16404" for subscription "sub1"
ALTER SUBSCRIPTION

At this point, GetConflictLogDestAndTable() has already determined
that the conflict should be logged to a table and has cached the
corresponding relation OID. However, the concurrent ALTER SUBSCRIPTION
has removed that table.

When execution resumes, the subsequent table_open() call fails with:
2026-06-22 09:24:53.072 IST [304864] ERROR: could not open relation with OID 16405

As a result, conflict processing itself fails before the conflict
details can be recorded. The conflict is therefore not logged to the
conflict log table and is also not emitted to the server log.
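
To make the expected behavior concrete, here is a minimal standalone
sketch of the fallback described above. It is a toy model, not
PostgreSQL code and not part of the patch: ToyRelation, toy_try_open(),
toy_drop_conflict_log_table() and toy_report_conflict() are all
hypothetical names, and the one-entry "catalog" only stands in for the
conflict log table with OID 16405 from the log above. The assumption
being illustrated is that the open step reports a missing relation to
the caller instead of raising an error, so the caller can still emit the
conflict to the server log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef unsigned int Oid;

/* Toy stand-in for a relation; NOT PostgreSQL's Relation. */
typedef struct ToyRelation
{
    Oid  relid;
    bool dropped;        /* set by the simulated ALTER SUBSCRIPTION */
    int  rows_inserted;  /* conflict rows written so far */
} ToyRelation;

/* Where a conflict record finally ended up. */
typedef enum ToyLogOutcome
{
    TOY_LOGGED_TO_TABLE,
    TOY_LOGGED_TO_SERVER_LOG
} ToyLogOutcome;

/* One-entry "catalog" holding the conflict log table (hypothetical). */
ToyRelation toy_clt = {16405, false, 0};

/* Hypothetical non-throwing open: returns NULL when the relation is gone. */
ToyRelation *
toy_try_open(Oid relid)
{
    assert(relid != 0);
    if (relid != toy_clt.relid || toy_clt.dropped)
        return NULL;
    return &toy_clt;
}

/* Simulates the concurrent destination change dropping the table. */
void
toy_drop_conflict_log_table(void)
{
    toy_clt.dropped = true;
}

/*
 * Report a conflict using an OID that was cached earlier. If the table
 * can no longer be opened, fall back to the server log (stderr here)
 * rather than failing before anything has been recorded.
 */
ToyLogOutcome
toy_report_conflict(Oid cached_relid, const char *detail)
{
    ToyRelation *rel = toy_try_open(cached_relid);

    if (rel != NULL)
    {
        rel->rows_inserted++;
        return TOY_LOGGED_TO_TABLE;
    }

    fprintf(stderr,
            "LOG: conflict detected: %s (conflict log table %u unavailable)\n",
            detail, cached_relid);
    return TOY_LOGGED_TO_SERVER_LOG;
}
```

In this model, calling toy_report_conflict() with the cached OID after
toy_drop_conflict_log_table() returns TOY_LOGGED_TO_SERVER_LOG, so the
conflict is still recorded somewhere. Whether the real code should
handle the stale OID this way, or prevent the drop with locking, is a
separate question that this sketch does not try to answer.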

Regards,
Vignesh
