Re: DDL deparse

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: DDL deparse
Date: 2026-06-26 06:45:14
Message-ID: CAD21AoAkTGcdCHiCOj7_OiT5m4FTCGmO4c1awRdN=-6gi6GN_Q@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jun 25, 2026 at 1:12 AM Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> wrote:
>
> Dear Hackers,
>
> > So my question is: should DDL deparse preserve user intent, or is it
> > acceptable to materialize the catalog state (in which case, should we
> > just extend pg_get_table_ddl instead)?
>
> I think, it depends on the usage scenarios. There are some use cases of:
> major upgrade, replication to other databases, replication to other
> cliends, DDL mining from WAL, logging and audit. In case of replication to
> other databases, splitting a complex command to a number of simple commands
> that materialize the same schema looks ok. In case of DDL mining from WAL
> or logging and audit scenarios it would be preferable to keep user intent.

Agreed. I think that generating DDL while keeping the user intent
would cover both cases.

ISTM that it's a balanced DDL deparse policy that generates DDLs from
parse tree + system catalogs while normalizing the query and
preserving user intent but not expand one command into multiple
commands. It might need special handlings for CTAS and LIKE clauses in
CREATE TABLE, though.

> Sorry for off-topic, but I'm still not sure that deparsing the original DDL
> command from the parse tree is the best approach. Below are some problem
> queries for deparsing original DDL from the parse tree.
>
> Query 1: CTAS with function
>
> CREATE OR REPLACE FUNCTION f1()
> RETURNS integer
> AS $$
> CREATE TABLE t2(c INT);
> SELECT 10;
> $$
> LANGUAGE SQL;
>
> CREATE TABLE t AS SELECT f1();
>
> Query 2: CTAS with a temp table
>
> CREATE TABLE t AS SELECT * FROM temptable;

I think CTAS needs special handling for DDL replication purposes; it
can be replicated as CREATE TABLE + INSERT. However, if DDL deparse
expands CTAS into CREATE TABLE + INSERT, it might be against the
policy that DDL deparse would preserve user intent. So an idea is that
DDL deparse has two output modes for CTAS: one generates CREATE TABLE
... AS to preserve the user intent (for audit cases etc.) and another
one generates CREATE TABLE. DDL replication would use the latter mode
to capture the CREATE TABLE.

>
> Query 3: CTAS with a function parameter
>
> CREATE TABLE t AS SELECT @mypar

This is not supported in PostgreSQL.

>
> Decoding final changes in the WAL seems to be the more suitable approach
> for such cases.
>
> Adding a new data into the WAL like a new record type with original DDL for
> for logical decoding seems redundant. Decoding system catalog changes from
> the WAL with some new hints may help to decode the commands like
> ALTER TABLE ALTER COLUMN TYPE USING. Defining user command boundaries in the
> WAL may help to decode the original command as well.

Are you referring to the idea of replicating system catalog changes as
DML to the subscriber? I'm not sure it works in cross-version
replication setup without some form of intermediate representation for
object definitions.

Or you might mean generating DDL from the WAL records of system
catalog changes. IIUC it reconstructs the system catalog state as DDL
and so loses the user intent. For instance, in v19 the default TOAST
compression became lz4, while it was pglz before. If we had DDL
replication in v18 and a user set up a logical replication v18 -> v19,
I think they would not want new tables on v19 to be forced to pglz
just because that was the publisher's default. but if they
deliberately chose pglz, they'd want it preserved. We cannot know
whether a user explicitly specified pglz when creating the table by
just looking at the system catalogs.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Adam Lee 2026-06-26 06:45:57 CLUSTER progress: wrong index_rebuild_count for tables with TOAST
Previous Message Richard Guo 2026-06-26 06:26:20 Re: Add enable_groupagg GUC parameter to control GroupAggregate usage