Re: DDL deparse

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: DDL deparse
Date: 2026-06-30 21:38:00
Message-ID: CAD21AoABOvWC8GTDUYvNzTDoxMJtdUSiNzBOg8kd6ao+rEZf+w@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 29, 2026 at 11:23 PM Vitaly Davydov
<v(dot)davydov(at)postgrespro(dot)ru> wrote:
>
> On 6/27/26 09:10, Masahiko Sawada wrote:
>
>
> > > I guess, for external clients it would be better
> > > to receive small simple commands instead of a single complex command.
> >
> > Could you elaborate on the reason for this point?
>
> When we speak about logical replication clients that are not relational
> databases, it would be much easier for developers to implement processing
> of received DDL data if it is in a simple and well defined form. If we
> replicate an original query like CTAS, it is hard to process such query on
> the non rdbms client. Another example - SELECT f(), where f() creates a
> number of tables. It is why I assumed that simple DDL queries may be better
> for external clients. I'm sorry, I have no other descriptive examples at
> the moment.

Thank you for the explanation.

>
> I'm still not sure, that splitting a single CREATE COMMAND into a number of
> simpler commands (like CREATE TABLE, ALTER TABLE ADD COLUMN, ADD INDEX, ADD
> CONSTRAINT) may help to developers of non-rdbms clients, but we should not
> send original queries to such non rdbms clients in general.

I personally think we should not split a single CREATE command into
multiple commands. IIUC the reason why PostgreSQL itself splits a
CREATE TABLE into multiple commands is that a consequence of the
object model (a sequence, an index etc.s are independent catalog
objects), not an attempt to give the user a simpler decomposed form.
So I don't think it implies deparse should mirror that decomposition.

For ALTER TABLE specifically, splitting is actively harmful, not just
unnecessary: splitting multiple sub-commands of one ALTER TABLE into
separate statements can turn one rewrite into several. So ALTER TABLE
sub-commands must stay together in one statement.

>
> To be honest, I haven't seen any non RDBMS clients that require DDL. It most
> cases, "Relation" (R) message from logical replication protocol may be enough.
> Not sure, my assumptions are real. It is why I think, that the main use case
> for DDL replication is the major upgrage or migration to another RDBMS.

The Relation message might work as an alternative of CREATE TABLE but
DDL replication would need to support ALTER TABLE as well.

>
>
> > > CREATE OR REPLACE FUNCTION temp_x () RETURNS void
> > > AS $$
> > > DECLARE
> > > _x integer;
> > > BEGIN
> > > CREATE TABLE t_b AS SELECT _x;
> > > END;
> > > $$ LANGUAGE plpgsql;
> >
> > Thank you for the clarification. If we replicate only the top-level
> > DDL, the CREATE TABLE inside the function temp_x() won't be
> > replicated.
>
> I guess, it is ok for another postgresql instance, but it doesn't not
> work for other types of clients in general, because they do not know
> how to process SELECT temp_x().

The built-in logical replication is designed for postgres-to-postgres
replication and it doesn't replicate the SELECT statement. I'm not
sure that the built-in logical replication needs to care about a
heterogeneous environment. As for replicating to another DBMS, event
trigger + DDL deparse would be the right solution.

Having said that, given that it's common to create/alter/drop table
within a SQL function (e.g., extensions managing partitioned tables
etc.), it would be better to replicate DDL commands too that are
executed within a SQL function. But I don't think that we should
replicate DDL commands that are internally generated while executing
another DDL command.

Also, CTAS is the one case that needs special handling, and I'd keep
it consistent with the "don't split" principle rather than as an
exception to it. The key point: deparse still emits a single CREATE
statement and does not generate INSERTs itself. For replication we
want CREATE TABLE + the rows, but the rows come through the normal DML
decoding path, not from deparse -- so deparse isn't splitting
anything. Concretely I'd give two modes: one emits "CREATE TABLE ...
AS ..." faithfully, the other emits just "CREATE TABLE ..." (schema
only); replication uses the latter and the data arrives as DML.

>
> > > (2) do not replicate (or optionally replicate) some
> > > specific settings like toast compression method.
> >
> > How can we draw a line for settings that are replicated or not replicated?
>
> It is a good question, I can't answer to it completely right now. But, we may
> disable replication of settings that belong to the physical storage
> configuration, not logical changes in the schema. These settings may be
> specific for a particular PostgreSQL major version and will not be applicable
> for non RDBMS clients. We may enable replication of such settings optionally,
> when needed.

True, but I think it's better to somehow preserve the user intent
rather than replicating only the pre-selected settings.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Haibo Yan 2026-06-30 21:41:08 Re: Global temporary tables
Previous Message Tom Lane 2026-06-30 21:31:47 Re: Centralised architecture detection