Re: Proposal: Conflict log history table for Logical Replication

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Proposal: Conflict log history table for Logical Replication
Date: 2026-07-03 03:38:45
Message-ID: CAJpy0uA+ZcDr6vgmWEQBLEkANxt88sQcg_2JJA7bCzM4p1enhw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 2, 2026 at 5:30 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Thu, Jul 2, 2026 at 4:48 PM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
> >
> > On Thu, 2 Jul 2026 at 13:39, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Jul 1, 2026 at 6:55 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > >
> > > > On Wed, Jul 1, 2026 at 1:16 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Wed, Jul 1, 2026 at 12:45 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > > > >
> > > > > > Thanks I have fixed these comments and also merged the docs.
> > > > >
> > > > > Thanks. Sharing the version with pgindent run on v61.
> > > >
> > > > Here is the updated patch version. It fixes offlist review comments
> > > > provided by Amit and Shveta, bumps the catversion and adds authors and
> > > > reviewers details.
> > >
> > > I rebased the remaining patches on top of HEAD. So far, I have run
> > > pgindent and completed the doc merge for 0001. The 0002 and 0003 are
> > > just rebased.
> >
> > Hi Dilip,
> >
> > While testing I hit the following assert:
> > + if (OidIsValid(relid))
> > + {
> > + /* Existing table from upgrade */
> > + Assert(IsBinaryUpgrade);
> > + }
> >
> > Steps to reproduce:
> > 1. Suppose initially we have a logical replication setup and the
> > subscriber is created with 'conflict_log_destination = log'
> > 2. Now we have two psql sessions of the subscriber node. Attach GDB to
> > first session and add breakpoint to line:
> > AlterSubscription->LockSharedObject(SubscriptionRelationId, subid, 0,
> > AccessExclusiveLock);
> > 3.Execute 'ALTER SUBSCRIPTION sub1 SET (conflict_log_destination =
> > 'all')' on the first session. It will stop at the breakpoint.
> > 4. Now execute 'ALTER SUBSCRIPTION sub1 SET (conflict_log_destination
> > = 'all');' on the second session. The execution on the second session
> > is successful and a conflict log table is created.
> > 5. Continue the execution of the first session. This will hit the
> > above Assert condition.
> >
> > I think this happens because for the first session, we have the stale
> > copy of 'sub'.
> > While the first session is stopped at the breakpoint, the second
> > session successfully executes ALTER SUBSCRIPTION and creates the
> > conflict log table. When the first session resumes, it still sees the
> > old value of old_dest (i.e CONFLICT_LOG_DEST_LOG), so
> > 'opts.conflictlogdest != old_dest' evaluates to true and
> > 'alter_sub_conflict_log_dest()' is invoked.
> > Since the second session has already created the conflict log table
> > and 'want_table' is true, the code finds an existing conflict log
> > table '(OidIsValid(relid))', causing the assert to fire.
>
> In short, if I understand correctly, the issue is that altering a
> subscription that tries to create the table might hit this assert
> because we assume that if we are switching from 'log' to 'table' or
> 'all', there should not be an existing table except in binary upgrade
> mode. This happens because we read the destination information from
> subscription cache while checking the table status in real time. In
> fact in other cases, it might cause alter subscription to error out if
> the table is created while we are creating the table. Therefore, we
> need to handle concurrent table creation during ALTER SUBSCRIPTION.
> One option is: if the table exists and it's not a binary upgrade,
> should we just error out?
>

No, I feel it should be Assert only. IMO, we should re-fetch 'sub'
beofre CLT creation or move first 'sub' fetching itself post
LockSharedObject if there is nothing blocking it. Thoughts?

thanks
Shveta

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mario González Troncoso 2026-07-03 03:47:58 Possible replace of strncpy on xactdesc.c
Previous Message Richard Guo 2026-07-03 03:36:09 Re: pg_plan_advice: FOREIGN_JOIN advice generated for a single-relation foreign scan is not round-trip safe