Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date: 2025-09-03 19:53:32
Message-ID: CAD21AoA7Y8JkVi-rSjjjYDnFNc0sjjXasu1Hb4UKOan3BSzUVQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 2, 2025 at 4:35 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Aug 29, 2025 at 9:38 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > I've attached the updated patch.
> >
>
> Few comments:

Thank you for the comments!

> =============
> 1.
> + * When XLogLogicalInfoActive() is true, guarantee that a subtransaction's
> + * xid can only be seen in the WAL stream if its toplevel xid has been
> + * logged before. If necessary we log an xact_assignment record with fewer
> + * than PGPROC_MAX_CACHED_SUBXIDS. Note that it is fine if didLogXid isn't
> + * set for a transaction even though it appears in a WAL record, we just
> + * might superfluously log something. That can happen when an xid is
> + * included somewhere inside a wal record, but not in XLogRecord->xl_xid,
> + * like in xl_standby_locks.
> */
> if (isSubXact && XLogLogicalInfoActive() &&
> !TopTransactionStateData.didLogXid)
>
> Instead of writing XLogLogicalInfoActive() is true in comments, can we
> say When effective wal_level is logical and then also point to some
> place if required where the patch has explained about effective
> wal_level? Otherwise, it sounds like we are writing what is apparent
> from code and may not be very clear.

Agreed.

>
> 2.
> - /*
> - * Invalidate logical slots if we are in hot standby and the primary
> - * does not have a WAL level sufficient for logical decoding. No need
> - * to search for potentially conflicting logically slots if standby is
> - * running with wal_level lower than logical, because in that case, we
> - * would have either disallowed creation of logical slots or
> - * invalidated existing ones.
> - */
> - if (InRecovery && InHotStandby &&
> - xlrec.wal_level < WAL_LEVEL_LOGICAL &&
> - wal_level >= WAL_LEVEL_LOGICAL)
> - InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
> - 0, InvalidOid,
> - InvalidTransactionId);
> -
> LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> ControlFile->MaxConnections = xlrec.MaxConnections;
> ControlFile->max_worker_processes = xlrec.max_worker_processes;
> @@ -8605,6 +8643,50 @@ xlog_redo(XLogReaderState *record)
> {
> /* nothing to do here, just for informational purposes */
> }
> + else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
> + {
> + bool logical_decoding;
> +
> + /* Update the status on shared memory */
> + memcpy(&logical_decoding, XLogRecGetData(record), sizeof(bool));
> + UpdateLogicalDecodingStatus(logical_decoding, true);
> +
> + if (InRecovery && InHotStandby)
> + {
> + if (!logical_decoding)
>
> Like previously, shouldn't we have a check for standby's wal_level as
> well due to the reasons mentioned in the removed comments?

IIUC we need to replay the STATUS_CHANGE record when wal_level is set
to 'replica' or 'logical'. If we want to add a check for standby's
wal_level, the check would be "wal_level >= WAL_LEVEL_REPLICA" but it
would be redundant as we already checked "InRecovery && InHotStandby".

> 3.
> + errmsg("logical decoding needs to be enabled to publish logical changes"),
>
> This message doesn't sound intuitive. How about "logical decoding
> should be allowed to publish logical changes"?

Agreed.

>
> 4.
> + else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
> ...
> + /*
> + * Request to launch or shutdown the slotsync worker depending on
> + * the new logical decoding status.
> + */
>
> If we see a similar part in existing code as a handling of
> XLOG_PARAMETER_CHANGE, we don't shutdown or restart slotsync worker,
> so why do it as part of this patch? This new behaviour may be better
> but shouldn't we try to handle it as a separate HEAD patch? Also, a
> few additional comments explaining the rationale behind this would be
> good.

Right, it seems out of scope. Removed that part.

> 5.
> Assert(RecoveryInProgress());
> ereport(ERROR,
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> - errmsg("logical decoding on standby requires \"wal_level\" >=
> \"logical\" on the primary")));
> + errmsg("logical decoding must be enabled on the primary")));
>
> Can't we keep the tone of the existing message as it is? How about
> "logical decoding on standby requires \"effective_wal_level\" >=
> \"logical\" on the primary"? Also, if we agree with this, we could
> have a similar change for other messages in the patch.

Agreed and changed all related places.

> 6. Can we write some comments as to why we didn't support wal_level to
> be changed from minimal to logical? It will be helpful for future
> readers/authors to understand what it would require to further extend
> this functionality.

Good point, I'll add the explanation.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2025-09-03 20:06:32 Re: index prefetching
Previous Message Andres Freund 2025-09-03 19:42:06 Re: Non-reproducible AIO failure