Re: Support logical replication of DDLs

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, "Yu Shi (Fujitsu)" <shiy(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "Wei Wang (Fujitsu)" <wangw(dot)fnst(at)fujitsu(dot)com>, Runqi Tian <runqidev(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, li jie <ggysxcq(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Japin Li <japinli(at)hotmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Zheng Li <zhengli10(at)gmail(dot)com>
Subject: Re: Support logical replication of DDLs
Date: 2023-06-09 05:38:42
Message-ID: CAA4eK1+XTupKt-29Y-LLcvgpMkRgMi7Q=tByQMOB3=7yfkJQMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Fri, Jun 9, 2023 at 5:35 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Thu, Jun 8, 2023 at 9:12 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> >
> > Please find new set of patches addressing below:
> > a) Issue mentioned by Wang-san in [1],
> > b) Comments from Peter given in [2]
> > c) Comments from Amit given in the last 2 emails.
> >
> > [1]: https://www.postgresql.org/message-id/OS3PR01MB62750D43D4F7F075B33BD2609E52A%40OS3PR01MB6275.jpnprd01.prod.outlook.com
> > [2]: https://www.postgresql.org/message-id/CAHut%2BPv9vPbUQc0fzrKmDkKOsS_bj-hup_E%2BsLHNEX%2B6F%2BSY5Q%40mail.gmail.com
> >
> > Thank You Vignesh for handling (a), Ajin for handling (b), Shi-san and
> > Hou-san for contributing in (c).
> >
>
> I have some questions about DDL deparser:
>
> I've tested deparsed and reformed DDL statements with the following
> function and event trigger borrowed from the regression tests:
>
> CREATE OR REPLACE FUNCTION test_ddl_deparse()
> RETURNS event_trigger LANGUAGE plpgsql AS
> $$
> DECLARE
> r record;
> deparsed_json text;
> BEGIN
> FOR r IN SELECT * FROM pg_event_trigger_ddl_commands()
> -- Some TABLE commands generate sequence-related
> commands, also deparse them.
> WHERE command_tag in ('ALTER FOREIGN TABLE', 'ALTER TABLE',
> 'CREATE
> FOREIGN TABLE', 'CREATE TABLE',
> 'CREATE
> TABLE AS', 'DROP FOREIGN TABLE',
> 'DROP
> TABLE', 'ALTER SEQUENCE',
> 'CREATE
> SEQUENCE', 'DROP SEQUENCE')
> LOOP
> deparsed_json = pg_catalog.ddl_deparse_to_json(r.command);
> RAISE NOTICE 'deparsed json: %', deparsed_json;
> RAISE NOTICE 're-formed command: %',
> pg_catalog.ddl_deparse_expand_command(deparsed_json);
> END LOOP;
> END;
> $$;
>
> CREATE EVENT TRIGGER test_ddl_deparse
> ON ddl_command_end EXECUTE PROCEDURE test_ddl_deparse();
>
> The query "CREATE TABLE TEST AS SELECT 1" is deparsed and reformed by
> DDL deparse to:
>
> CREATE TABLE public.test ("?column?" pg_catalog.int4 STORAGE PLAIN )
>
> I agree that we need to deparse CTAS in such a way for logical
> replication but IIUC DDL deparse feature (e.g., ddl_deparse_to_json()
> and ddl_deparse_expand_command()) is a generic feature in principle so
> I'm concerned that there might be users who want to get the DDL
> statement that is actually executed. If so, we might want to have a
> switch to get the actual DDL statement instead.
>

I agree with an additional switch here but OTOH I think we should
consider excluding CTAS and SELECT INTO in the first version to avoid
further complications to a bigger patch. Let's just do it for
CREATE/ALTER/DROP table.

> Also, the table and column data type are schema qualified, but is
> there any reason why the reformed query doesn't explicitly specify
> tablespace, toast compression and access method? Since their default
> values depend on GUC parameters, the table could be created in a
> different tablespace on the subscriber if the subscriber set a
> different value to default_tablespace GUC parameter. IIUC the reason
> why we use schema-qualified names instead of sending along with
> search_path is to prevent tables from being created in a different
> schema depending on search_patch setting on the subscriber.
>

I think we do send schema name during DML replication as well, so
probably doing the same for DDL replication makes sense from that
angle as well. The other related point is that apply worker always set
search_path as an empty string.

> So I
> wondered why we don't do that for other GUC-depends propety.
>

Yeah, that would probably be a good idea but do we want to do it in
default mode? I think if we want to optimize the default mode such
that we only WAL log the DDL with user-specified options then
appending options for default GUCs would make the string to be WAL
logged unnecessarily long. I am thinking that in default mode we can
allow subscribers to perform operations with their default respective
GUCs.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Rama Krishnan 2023-06-09 05:51:20 How to store query result into another table using stored procedure
Previous Message Nim Li 2023-06-09 03:21:28 Question about where to deploy the business logics for data processing

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-06-09 05:58:42 Re: Introduce WAIT_EVENT_EXTENSION and WAIT_EVENT_BUFFER_PIN
Previous Message Masahiko Sawada 2023-06-09 05:16:44 Skip collecting decoded changes of already-aborted transactions