Re: Add a hook for handling logical decoding messages on subscribers.

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add a hook for handling logical decoding messages on subscribers.
Date: 2026-06-23 18:48:28
Message-ID: CAD21AoBMNKp9LOaEckNp5teYxJvPGd=9_ZJ-mxLBx+C=O78mbw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 22, 2026 at 3:39 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Fri, Jun 19, 2026 at 3:34 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > Hi all,
> >
> > Commit ac4645c015 allows pgoutput to send logical decoding messages,
> > but it's limited to applications that use the pgoutput plugin -- the
> > built-in logical replication doesn't use it. I'd like to propose
> > introducing a hook to the logical replication message handling so that
> > extensions can plug in their own handling routine. This feature can be
> > used for extensions to implement DDL replication, function
> > replication, or trigger user-specific routines on the subscriber side.
>
> Thanks for working on this!
>
> > I've attached the PoC patch; it adds a hook function, and adds a new
> > 'message' subscription option that allows the user to request the
> > publisher to send logical decoding messages. Therefore, users need to
> > enable the 'message' option and set up the hook function at server
> > startup in order to receive the messages and trigger the hook
> > function.
>
> I understand the intent of the proposal, but I'd like to get the
> bigger picture first.
>
> Do we have any external modules that actually implement DDL
> replication (or any of the listed use-cases) with a similar hook? Or
> any existing discussion? I could be missing something because I
> haven't looked at all the DDL replication related threads.

Not that I'm aware of. That said, in past DDL replication discussions
several approaches were proposed (e.g. sending the DDL query text vs.
sending schema diffs in some form of intermediate representation),
each with its own pros and cons. So even once we have an in-core DDL
replication implementation, I think it's still plausible that an
extension would want to implement a different approach, and this hook
would let it do so without patching core.

>
> Another thing I'm curious about - why a hook? Is the plan to implement
> DDL replication as an external module rather than in core? If DDL
> replication eventually gets into core, I'd expect it to be apply-side
> logic executing the decoded DDL messages directly, not something going
> through a hook.

I'm working on in-core DDL replication too, and I agree that the
in-core version would be apply-side logic executing the decoded DDL
directly, not something going through this hook. The hook isn't meant
for implementing the in-core DDL replication. It's there for things
core wouldn't cover: auditing replayed DDLs, propagating
context-specific information to the subscriber, or an extension
implementing a different DDL replication approach than the in-core
one.

> Why not a hook at apply_dispatch to give external modules more freedom
> with the pgoutput plugin?

It would let an extension intercept every message type
(INSERT/UPDATE/DELETE, COMMIT, etc.) and could easily break apply
consistency. At the same time, the incoming data is just the standard
logical replication protocol, so there isn't much an extension could
usefully do with the non-MESSAGE types beyond what core already does.
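
To illustrate the distinction, here is a minimal standalone sketch of a
dispatcher that exposes only MESSAGE to an extension. It is not the code
from the patch: dispatch_message, message_hook, and the counters are
hypothetical names invented for this example, and the "core" handling is
reduced to a counter. Only the message type bytes follow the logical
replication protocol.

```c
#include <stddef.h>

/* Message type bytes as used by the logical replication protocol. */
#define MSG_BEGIN   'B'
#define MSG_COMMIT  'C'
#define MSG_INSERT  'I'
#define MSG_UPDATE  'U'
#define MSG_DELETE  'D'
#define MSG_MESSAGE 'M'

/* Hypothetical hook type: it only ever sees logical decoding messages. */
typedef void (*message_hook_type) (const char *prefix, const char *payload);

/* Hypothetical hook variable; NULL means no extension installed one. */
message_hook_type message_hook = NULL;

/* Stand-in for core apply logic: counts messages handled by "core". */
int core_handled = 0;

/*
 * Returns 1 if the message was passed to the hook, 0 if it was handled
 * by core or ignored, and -1 for an unknown message type.
 */
int
dispatch_message(char type, const char *prefix, const char *payload)
{
    switch (type)
    {
        case MSG_BEGIN:
        case MSG_COMMIT:
        case MSG_INSERT:
        case MSG_UPDATE:
        case MSG_DELETE:
            /* Never visible to extensions, so apply stays consistent. */
            core_handled++;
            return 0;
        case MSG_MESSAGE:
            if (message_hook != NULL)
            {
                message_hook(prefix, payload);
                return 1;
            }
            return 0;           /* no hook installed: message is ignored */
        default:
            return -1;
    }
}

/* Example hook for the sketch: just counts how often it is called. */
int hook_calls = 0;

void
counting_hook(const char *prefix, const char *payload)
{
    (void) prefix;
    (void) payload;
    hook_calls++;
}
```

With counting_hook installed, an 'I' message still goes only to the core
path, while an 'M' message reaches the hook.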

>
> > I went with a hook function in the patch. While it lets you chain
> > multiple hook functions, providing a registration API might be
> > better, or other types of registry could also be considered.
>
> It's hard to tell how many external modules would make use of this
> hook (rather, how many external modules implementing this hook one
> would allow to be installed in a production database requiring
> chaining), but my first thought is that a registration-based API along
> the lines of RegisterXactCallback would be cleaner and work better.

Yeah, a registration-based API would be better.
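
For reference, a rough standalone sketch of what a registration-based
API along the lines of RegisterXactCallback could look like. This is an
assumption about the shape, not what the patch implements: the names
RegisterLogicalMessageCallback and CallLogicalMessageCallbacks, the
LogicalMessage struct, and the PrefixCounter example are all
hypothetical, and it uses malloc instead of a memory context so that it
compiles outside the server.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the message data passed to callbacks. */
typedef struct LogicalMessage
{
    const char *prefix;
    const char *message;
} LogicalMessage;

/* Callback signature: the message plus the arg given at registration. */
typedef void (*LogicalMessageCallback) (const LogicalMessage *msg, void *arg);

/* One registered callback; items form a singly linked list. */
typedef struct CallbackItem
{
    struct CallbackItem *next;
    LogicalMessageCallback callback;
    void       *arg;
} CallbackItem;

static CallbackItem *callbacks = NULL;

/* Register a callback. Returns 0 on success, -1 if allocation fails. */
int
RegisterLogicalMessageCallback(LogicalMessageCallback callback, void *arg)
{
    CallbackItem *item = malloc(sizeof(CallbackItem));

    if (item == NULL)
        return -1;
    item->callback = callback;
    item->arg = arg;
    item->next = callbacks;     /* push onto the head of the list */
    callbacks = item;
    return 0;
}

/* Invoke every registered callback; returns how many were called. */
int
CallLogicalMessageCallbacks(const LogicalMessage *msg)
{
    int         ncalled = 0;

    for (CallbackItem *item = callbacks; item != NULL; item = item->next)
    {
        item->callback(msg, item->arg);
        ncalled++;
    }
    return ncalled;
}

/* Example callback state: counts messages carrying a given prefix. */
typedef struct PrefixCounter
{
    const char *prefix;
    int         count;
} PrefixCounter;

void
count_prefix(const LogicalMessage *msg, void *arg)
{
    PrefixCounter *pc = arg;

    if (strcmp(msg->prefix, pc->prefix) == 0)
        pc->count++;
}
```

Each module registers independently and filters on its own prefix, so
no module has to save and call a previous hook value the way chained
hooks require.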

>
> > Feedback is very welcome.
>
> A few comments on the patch:
>
> 1/
> + bool message; /* True if the subscription wants to receive
> + * logical messages */
> } Subscription;
>
> Nit: I'd call these logical decoding messages or generic logical
> messages - something to match the docs and pg_logical_emit_message.

Will fix.

> 2/
> +void
> +test_logical_message_handler(LogicalRepMessageData *msg)
> +{
> + ereport(LOG,
> + (errmsg("received message: LSN %X/%08X, prefix: %s, message: %s",
> + LSN_FORMAT_ARGS(msg->lsn),
> + msg->prefix,
> + msg->message)));
> +}
>
> Why not have the test extension do a simple DDL/function/event-trigger
> based replication? It doesn't need to be a full-blown implementation,
> but to show the usefulness of this hook, it's better to have one
> demonstrating the listed use-cases.

I'd prefer to keep this test module minimal, since its purpose is to
improve coverage of the newly added code. A worked example that
demonstrates the usefulness of the hook is valuable, but I think it
belongs in a separate contrib module rather than in the test module,
and that's probably a separate discussion.

>
> 3/
> + if (subinfo->submessage)
> + appendPQExpBufferStr(query, ", message = true");
>
> Why a subscription-level option? Why not leave the decision of whether
> or not to act on the message to the external module implementers?

This is to avoid affecting existing logical replication users. Since
logical decoding messages are not sent today, I think it should be an
explicit opt-in feature, so that users who don't want the publisher to
send logical decoding messages see no change in behavior.

>
> 4/
> + /*
> + * Logical messages are handled only the (parallel) apply workers
> + */
> + if (am_tablesync_worker() || am_sequencesync_worker())
> + return;
>
> Why these restrictions? Why not leave the decision to external module
> implementers? Isn't this limiting - what if someone wants to use this
> hook for the schema sync during the initial table sync phase?

Logical decoding messages are relation-agnostic, so they don't map
cleanly onto the table sync phase. If every tablesync worker processed
the messages they'd be handled multiple times, and if only one did,
there'd be no well-defined ordering between a message and the initial
copy progress. So I think letting tablesync workers run the hook would
be more confusing than useful.

>
> 5/ With this change, pg_logical_emit_message does affect the logical
> replication apply if the subscriber has defined this hook. I think
> it's worth mentioning in the docs for pg_logical_emit_message.

Agreed.

Regards,

--
Masahiko Sawada

Amazon Web Services: https://aws.amazon.com
