| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> |
| Cc: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Support logical replication of DDLs, take2 |
| Date: | 2026-04-20 23:14:32 |
| Message-ID: | CAD21AoCzT3sytVbimRNdjRF=N3R-8ddaWKW95EzsdderXqcm4g@mail.gmail.com |
| Lists: | pgsql-general pgsql-hackers |
Hi,
On Mon, Feb 23, 2026 at 5:21 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
>
> One idea I'm experimenting with is that we define an abstract data
> type that can represent a DDL (like CollectedCommand) and write it to
> a new WAL record so that logical decoding processes it. For CREATE
> DDLs, we can use pg_get_xxx_def() function while using a historical
> snapshot to get the DDLs. We would need to implement the codes to
> generate DROP and ALTER DDLs from the data. I believe DROP DDLs would
> not be hard. For ALTER DDLs, we would incur the initial implementation
> costs, but we would not change these codes often.
>
DDL support for logical replication is one of the biggest missing
pieces in logical replication. I'd like to resume this work for PG20.
We put a lot of effort into this feature through 2022 and 2023, but
development is currently inactive; the last patch was submitted on Jul
18, 2023. I've reviewed the previous patches and discussions, and I
would like to summarize how DDL replication was implemented, the main
reasons it stalled, and propose an alternative design to address those
problems.
The overall idea of the previous patch set was to implement DDL
deparsing and utilize it for DDL replication. It converted a parse
tree into a JSON string. For instance, if a user executes "DROP TABLE
t1", the deparser converts its parse tree:
{DROPSTMT :objects (("t1")) :removeType 41 :behavior 0 :missing_ok
false :concurrent false}
into:
{"fmt": "DROP TABLE %{objidentity}s", "objidentity": "public.t1"}
This JSON string is self-documenting, meaning someone who gets it can
easily reconstruct the original DDL with schema-qualified object
names. In a dedicated event trigger for logical replication, we
deparsed the parse tree of a DDL, wrote the result into a WAL record,
and logical decoding then processed it similarly to DML changes.
While there are several benefits to the JSON data approach mentioned
in the wiki [1] -- most notably the flexibility to easily remap
schemas (e.g., mapping "schema A" on the publisher to "schema B" on
the subscriber) -- there was a major concern: the huge maintenance
burden. We would need to maintain the JSON serialization code whenever
creating or modifying parse nodes, regardless of whether the changes
were related to DDL replication. IIUC, this was the primary reason the
feature didn't cross the finish line.
Additionally, I think there is another design issue: it is not
output-plugin agnostic. Since the deparsed DDL was written by a
logical-replication-specific event trigger, third-party logical
decoding plugins cannot easily detect DDL events. Ideally, we should
write DDL information into a WAL record natively when
wal_level='logical' (or additionally when a GUC enables DDL events
WAL-logging) so that all decoding plugins can detect them. This also
allows us to test DDL logical decoding with test_decoding without
setting up a full logical replication subscription.
To address these two points, I'd like to propose an alternative
approach: we introduce a new data type, say DDLCommand, that
represents a DDL in a self-contained way (like CollectedCommand), and
we don't rely on event triggers. It would contain the command type
(and subtype if
required), the OIDs of the target object and its namespace, and the
OID of the user who executed the DDL. We write it to a new WAL record
at appropriate places during DDL execution, and the logical decoding
layer passes the data to output plugins. That way, any logical
decoding plugin can detect DDL changes, and it's up to the plugins how
to decode the DDL information.
In pgoutput, for CREATE DDLs, we can use the pg_get_xxx_def()
functions while using a historical snapshot to get the DDLs, saving
maintenance costs. We would still need to implement the code to
generate DROP and ALTER DDLs from the data. I believe DROP DDLs would
not be hard. For ALTER DDLs, we would incur an initial implementation
cost, but we would not need to change this code often. We can
implement the DDL generation code in a way that improves ddlutils.c.
Also, because DDLCommand is separated from parse nodes, we only need
to change the DDL deparse/replication code when it is actually needed.
Additionally, this approach would eliminate the code around the
two-step process (using DCT_TableDropStart and DCT_TableDropEnd) for
DROP TABLE. While it would miss the flexibility benefits that the JSON
deparsing approach has, I guess it would not be very hard to implement
the mapping in the deparse layer even without the JSON data.
Regarding the publication syntax, previous patches proposed:
CREATE PUBLICATION pub FOR ALL TABLES WITH (ddl = 'table');
While simple, it doesn't support critical enterprise use cases (e.g.,
DWH environments) where users want to replicate CREATE and ALTER, but
explicitly filter out DROP TABLE to prevent accidental data loss. We
should consider introducing publish_ddl options to filter operations:
CREATE PUBLICATION pub FOR ALL TABLES WITH (publish_ddl = 'create, alter');
I have implemented the basic idea with the above changes and it seems
to work well, though the patch is not yet ready to share. I'd like to
resume the discussion to move this project forward. My initial MVP
goal is to support CREATE/ALTER/DROP TABLE, which covers the vast
majority of use cases, and incrementally extend support for other
object types later.
FYI, I've experimented with auto-generation approaches too. For
instance, gen_node_support.pl could generate C code that converts
parse nodes to the corresponding text representations, or C code that
makes all objects in a given SQL query text fully schema-qualified.
While these ideas are promising, they didn't help reduce the
maintenance burden much, as the parse node definitions are already
complex and vary widely from node to node.
Thank you for taking the time to read through this long email.
Regards,
[1] https://wiki.postgresql.org/wiki/Logical_replication_of_DDLs#JSONB_Benefits
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com