| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> |
| Cc: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: DDL deparse |
| Date: | 2026-06-27 06:10:56 |
| Message-ID: | CAD21AoBCXrzhzLFoT=R_qRcBVUG-hqp8QyHOfTMynv6srggiCQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Fri, Jun 26, 2026 at 3:59 AM Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> wrote:
>
> Dear Masahiko Sawada, All
>
> > Agreed. I think that generating DDL while keeping the user intent
> > would cover both cases.
>
> > I think CTAS needs special handling for DDL replication purposes; it
> > can be replicated as CREATE TABLE + INSERT. However, if DDL deparse
> > expands CTAS into CREATE TABLE + INSERT, it might be against the
> > policy that DDL deparse would preserve user intent. So an idea is that
> > DDL deparse has two output modes for CTAS: one generates CREATE TABLE
> > ... AS to preserve the user intent (for audit cases etc.) and another
> > one generates CREATE TABLE. DDL replication would use the latter mode
> > to capture the CREATE TABLE.
>
> Some more thoughts...
>
> Scenarios, that require to keep the original query, may relate to other
> types of solutions, than DDL replication. For example, logging and auditing
> solutions do not require DDL to be applied. Solutions may just log the
> original query unchanged. There is no need to support such solutions
> in the core. Utility hook seems to be the right choice.
>
> Solutions, like WAL miners, may require the original query if a DBA
> wants to look closely on some WAL changes, but it is not a strict
> requirement if such solutions are used for CDC purposes.
Fair point.
> For replication purpose we must guarantee that the replica (or an
> external client) gets the right set of commands, that reproduce the
> schema on the master. A number of simple commands instead a single
> command looks ok.
Right.
> I guess, for external clients it would be better
> to receive small simple commands instead of a single complex command.
Could you elaborate on the reason for this point?
> I guess, we may optionally keep both the original query and the final
> set of simple DDL commands suitable for replication purposes.
Logging, auditing, and WAL-mining are fine with the original query and
can be handled outside the core -- none of them need an
intent-preserving deparse in core. That leaves DDL replication as the
single use case for an intent-preserving DDL deparse. Expanding one
command into several is fine for DDL replication, but knowing what the
user specified versus what was left to the default is something
replication does need. And that is what pg_get_table_ddl() cannot
provide. I thought DDL deparse should not expand one command into
several, partly because that property would make it useful beyond
replication and help justify building it. But that might be
optimistic. Other use cases don't really need it, and expanding or not
doesn't matter much for DDL replication.
> Furthermore, an original user query like SELECT f() may be finally
> represented in the form of CREATE TABLE, if f() creates a new table.
> Which query should we get for logging or auditing purposes?
I'm thinking of logging only the top-level query.
>
> >>
> >> Query 3: CTAS with a function parameter
> >>
> >> CREATE TABLE t AS SELECT @mypar
> > This is not supported in PostgreSQL.
>
> Sorry, I mean a query like below, where a function variable is used:
>
> CREATE OR REPLACE FUNCTION temp_x () RETURNS void
> AS $$
> DECLARE
> _x integer;
> BEGIN
> CREATE TABLE t_b AS SELECT _x;
> END;
> $$ LANGUAGE plpgsql;
Thank you for the clarification. If we replicate only the top-level
DDL, the CREATE TABLE inside the function temp_x() won't be
replicated.
>
> > Or you might mean generating DDL from the WAL records of system
> > catalog changes. IIUC it reconstructs the system catalog state as DDL
> > and so loses the user intent. For instance, in v19 the default TOAST
> > compression became lz4, while it was pglz before. If we had DDL
> > replication in v18 and a user set up a logical replication v18 -> v19,
> > I think they would not want new tables on v19 to be forced to pglz
> > just because that was the publisher's default. but if they
> > deliberately chose pglz, they'd want it preserved. We cannot know
> > whether a user explicitly specified pglz when creating the table by
> > just looking at the system catalogs.
>
> I mean to decode DDL changes from the WAL on the master and convert it
> into a suitable form (sql or json) before sending to a replica or a
> client.
>
> Thank you for mentioning such example. I agree, it may be hard to define,
> whether a specific setting was explicitly set by user or not. There are
> some possible solutions may be like: (1) ignore the setting if it has
> a default value,
I guess it's hard to determine where the default setting came from; it
might be because it was not specified in the query or the user
explicitly specified the default value.
> (2) do not replicate (or optionally replicate) some
> specific settings like toast compression method.
How can we draw a line for settings that are replicated or not replicated?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Xuneng Zhou | 2026-06-27 06:11:07 | Re: Deadlock detector fails to activate on a hot standby replica |
| Previous Message | Bharath Rupireddy | 2026-06-27 05:49:00 | Re: Fix race condition in pg_get_publication_tables with concurrent DROP TABLE |