| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
| Cc: | Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Support logical replication of DDLs, take2 |
| Date: | 2026-04-28 21:38:35 |
| Message-ID: | CAD21AoDn5ij+A5=maFh03Xkcp7cPxmhGPrGPvboTHvYgNP+h_g@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-general pgsql-hackers |
On Sun, Apr 26, 2026 at 11:15 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Tue, Apr 21, 2026 at 4:45 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
>
> > We made a lot of effort on this feature through 2022 and 2023, but the
> > development is currently inactive. The last patch was submitted on Jul
> > 18, 2023. I've reviewed the previous patches and discussions, and I
> > would like to summarize how DDL replication was implemented, the main
> > reasons it stalled, and propose an alternative design to address those
> > problems.
> >
> > The overall idea of the previous patch set was to implement DDL
> > deparsing and utilize it for DDL replication. It converted a parse
> > tree into a JSON string. For instance, if a user executes "DROP TABLE
> > t1", the deparser generates from its parse tree:
> >
> > {DROPSTMT :objects (("t1")) :removeType 41 :behavior 0 :missing_ok
> > false :concurrent false}
> >
> > to:
> >
> > {"fmt": "DROP TABLE %{objidentity}s", "objidentity": "public.t1"}
> >
> > This JSON string is self-documenting, meaning someone who gets it can
> > easily reconstruct the original DDL with schema-qualified object
> > names. In a dedicated event trigger for logical replication, we
> > deparsed the parse tree of a DDL, wrote it into a WAL record, and then
> > the logical decoding processed it similarly to DML changes.
>
> I think there was also a discussion on whether to use JSON vs the
> existing infrastructure of converting nodes to strings. Although the
> JSON is standard format and might provide more flexibility using
> existing format avoid extra maintence burden.
Right, but since the string representation of nodes are major version
dependent, we cannot directly send them to subscribers that might be
different major versions.
>
> > While there are several benefits to the JSON data approach mentioned
> > in the wiki [1] -- most notably the flexibility to easily remap
> > schemas (e.g., mapping "schema A" on the publisher to "schema B" on
> > the subscriber) -- there was a major concern: the huge maintenance
> > burden. We would need to maintain the JSON serialization code whenever
> > creating or modifying parse nodes, regardless of whether the changes
> > were related to DDL replication. IIUC, this was the primary reason the
> > feature didn't cross the finish line.
>
> Do you mean that modifying any existing parse node requires changing
> the JSON serialization code? But why do we need to do that if that's
> not related to DDL? I don't think I understood this point clearly,
> can you explain it or point me to the discussion thread?
IIUC, many parse nodes need to support JSON serialization even if we
only want to support CREATE/ALTER/DROP TABLE commands. This is because
these commands can include expressions (e.g., in DEFAULT clauses or
CHECK constraints), function calls, and column references. We would
need to recursively deparse the entire expression tree.
>
> > Additionally, I think there is another design issue: it is not
> > output-plugin agnostic. Since the deparsed DDL was written by a
> > logical-replication-specific event trigger, third-party logical
> > decoding plugins cannot easily detect DDL events. Ideally, we should
> > write DDL information into a WAL record natively when
> > wal_level='logical' (or additionally when a GUC enables DDL events
> > WAL-logging) so that all decoding plugins can detect them. This also
> > allows us to test DDL logical decoding with test_decoding without
> > setting up a full logical replication subscription.
>
> Yeah, that's a valid point, but I think we could separate the event
> trigger logic from the decoding plugins so that it's available for any
> other output plugins?
Yes, it's possible. But I'm concerned that the operations users (or
plugins) would need to enable DDL events would be quite different from
capturing DMLs. I'm not sure if it's a good user experience or plugin
developer experience that additional steps are required to capture DDL
events while changes of INSERT/UPDATE/DELETE/TRUNCATE are passed from
the logical decoding by default.
>
> > To address these two points, I'd like to propose an alternative
> > approach: we introduce a new data type, say DDLCommand, that is
> > self-contained to represent a DDL (like CollectedCommand), and don't
> > rely on event triggers. It would have the command type (and subtype if
> > required), the OIDs of the target object and its namespace, and the
> > OID of the user who executed the DDL. We write it to a new WAL record
> > at appropriate places during DDL execution, and the logical decoding
> > layer passes the data to output plugins. That way, any logical
> > decoding plugin can detect DDL changes, and it's up to the plugins how
> > to decode the DDL information.
>
> Interesting. I'm trying to figure out exactly when we plan to
> construct this new DDLCommand data type? And how would we prepare
> this, by converting the internal DDL structures or from parsetree?
I'm still unsure what is the best approach but I think we can
construct the DDLCommand data from the internal DDL structures, and
each DDL command can call the function to write the DDL command
information to a WAL record.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Masahiko Sawada | 2026-04-28 21:48:31 | Re: Support logical replication of DDLs, take2 |
| Previous Message | Hannu Krosing | 2026-04-28 19:55:07 | Re: Support logical replication of DDLs, take2 |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Masahiko Sawada | 2026-04-28 21:48:31 | Re: Support logical replication of DDLs, take2 |
| Previous Message | Gurmokh | 2026-04-28 21:21:35 | Re: New vacuum config to avoid anti wraparound vacuums |