Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date: 2025-11-14 09:43:53
Message-ID: CAD21AoCPOo5fOggPHkd6BePjWBFm_nUK7Dk36EGsqEW+n+Hu8w@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 14, 2025 at 1:25 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> In an offline discussion with Kuroda-san, we realized that TRUNCATE
> may hit Assert(XLogLogicalInfoActive()) in ExecuteTruncateGuts() under
> our current implementation, where logical decoding is disabled lazily.
>
> Consider the case where there’s only one logical slot and we attempt
> to drop it. The backend issues the drop request, but before the
> checkpointer actually disables logical decoding, a TRUNCATE is
> executed. Since logical decoding is still marked as active at that
> moment, the ExecuteTruncate() appends the OID to relids_logged.
> However, by the time control reaches ExecuteTruncateGuts, the
> checkpointer has already disabled logical decoding resulting in
> Assert.
>
> TRAP: failed Assert("XLogLogicalInfoActive()"), File: "tablecmds.c",

Good find. I think this assertion is no longer valid given that
XLogLogicalInfoActive() can change dynamically. It's safe to write
logical information to WAL records even if it's not strictly required,
so we can keep writing logical info even if XLogLogicalInfoActive()
starts returning false in the middle of the command. In the opposite
case, where logical decoding becomes enabled in the middle, a similar
thing happens, but we will wait for such a transaction to finish
before starting logical decoding. Therefore, we can remove the
assertion. What do you think?
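
To illustrate the ordering, here is a standalone sketch (not PostgreSQL
code, and not the patch). The flag and all function names are
hypothetical stand-ins: logical_info_active plays the role of
XLogLogicalInfoActive(), collect_relid_logged() the decision made in
ExecuteTruncate(), and truncate_guts() the later step that acts on that
decision without asserting on the current state.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state behind XLogLogicalInfoActive(). */
bool logical_info_active = true;

/*
 * Stand-in for the decision taken early in the command: remember
 * whether the relation should be logged for logical decoding.
 */
bool
collect_relid_logged(void)
{
    return logical_info_active;
}

/*
 * Stand-in for the later step. It trusts the earlier decision and does
 * not re-check the flag, so a concurrent disable cannot trip an
 * assertion. Returns true if logical info would be written.
 */
bool
truncate_guts(bool relid_logged)
{
    /* Extra logical info in WAL is harmless if decoding was disabled. */
    return relid_logged;
}

/*
 * Replays the reported ordering in a single thread: decide, then the
 * "checkpointer" disables logical decoding, then execute.
 */
bool
simulate_race(void)
{
    bool relid_logged;

    logical_info_active = true;
    relid_logged = collect_relid_logged();
    logical_info_active = false;    /* disabled between the two steps */
    return truncate_guts(relid_logged);
}
```

In this sketch simulate_race() still writes the logical info although
the flag has already flipped, which is the behavior argued to be safe
above.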

I'll check if there are other similar issues due to this patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
