Re: Support logical replication of DDLs

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ajin Cherian <itsajin(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Runqi Tian <runqidev(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, li jie <ggysxcq(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Japin Li <japinli(at)hotmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Zheng Li <zhengli10(at)gmail(dot)com>
Subject: Re: Support logical replication of DDLs
Date: 2023-04-26 09:33:17
Message-ID: CALDaNm2WZ1Dif2gqoso92e5rfSRf1p99i=R280ODLoDJvPTo_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Wed, 26 Apr 2023 at 12:02, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Mar 28, 2023 at 3:22 PM houzj(dot)fnst(at)fujitsu(dot)com
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Tuesday, March 28, 2023 1:41 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Mon, Mar 27, 2023 at 5:37 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> > > wrote:
> > > >
> > > > On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> > > wrote:
> > > > >
> > > > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > > > >
> > > > >
> > > > > > I suggest taking a couple of steps back from the minutiae of the
> > > > > > patch, and spending some hard effort thinking about how the thing
> > > > > > would be controlled in a useful fashion (that is, a real design
> > > > > > for the filtering that was mentioned at the very outset), and
> > > > > > about the security issues, and about how we could get to a committable
> > > patch.
> > > > > >
> > > > >
> > > > > Agreed. I'll try to summarize the discussion we have till now on
> > > > > this and share my thoughts on the same in a separate email.
> > > > >
> > > >
> > > > The idea to control what could be replicated is to introduce a new
> > > > publication option 'ddl' along with current options 'publish' and
> > > > 'publish_via_partition_root'. The values of this new option could be
> > > > 'table', 'function', 'all', etc. Here 'all' enables the replication of
> > > > all supported DDL commands. Example usage for this would be:
> > > > Example:
> > > > Create a new publication with all ddl replication enabled:
> > > > CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all');
> > > >
> > > > Enable table ddl replication for an existing Publication:
> > > > ALTER PUBLICATION pub2 SET (ddl = 'table');
> > > >
> > > > This is what seems to have been discussed but I think we can even
> > > > extend it to support based on operations/commands, say one would like
> > > > to publish only 'create' and 'drop' of tables. Then we can extend the
> > > > existing publish option to have values like 'create', 'alter', and
> > > > 'drop'.
> > > >
> > >
> > > The other idea could be to that for the new option ddl, we input command tags
> > > such that the replication will happen for those commands.
> > > For example, ALTER PUBLICATION pub2 SET (ddl = 'Create Table, Alter
> > > Table, ..'); This will obviate the need to have additional values like 'create', 'alter',
> > > and 'drop' for publish option.
> > >
> > > The other thought related to filtering is that one might want to filter DDLs and
> > > or DMLs performed by specific roles in the future. So, we then need to
> > > introduce another option ddl_role, or something like that.
> > >
> > > Can we think of some other kind of filter for DDL replication?
> >
> > I am thinking another generic syntax for ddl replication like:
> >
> > --
> > CREATE PUBLICATION pubname FOR object_type object_name with (publish = 'ddl_type');
> > --
> >
> > To replicate DDLs that happened on a table, we don't need to add new syntax or
> > option, we can extend the value for the 'publish' option like:
> >
> > To support more non-table objects replication, we can follow the same style and write it like:
> > --
> > CRAETE PUBLICATION FOR FUNCTION f1 with (publish = 'alter'); -- function
> > CRAETE PUBLICATION FOR ALL OPERATORS IN SCHEMA op_schema with (publish = 'drop'); -- operators
> > CRAETE PUBLICATION FOR ALL OBJECTS with (publish = 'alter, create, drop'); -- everything
> > --
> >
> > In this approach, we extend the publication grammar and users can
> > filter the object schema, object name, object type and ddltype. We can also add
> > more options to filter role or other infos in the future.
>
> In this approach, does the subscriber need to track what objects have
> been subscribed similar to tables? For example, suppose that we
> created a publication for function func1 and created a subscription
> for the publication. What if we add function func2 to the publication?
> If we follow the current behavior, DDLs for func2 will be replicated
> to the subscriber but the subscriber won't apply it unless we refresh
> the publication of the subscription. So it seems to me that the
> subscriber needs to have a list of subscribed functions, and we will
> end up having lists of all types of objects.
>
> >
> > ~~~~
> >
> > One more alternative could be like:
> >
> > One more alternative could be like:
> > CREATE PUBLICATION xx FOR pub_create_alter_table WITH (ddl = 'table:create,alter'); -- it will publish create table and alter table operations.
> > CREATE PUBLICATION xx FOR pub_all_table WITH (ddl = 'table:all'); -- This means all table operations create/alter/drop
> > CREATE PUBLICATION xx FOR pub_all_table WITH (ddl = 'table'); -- same as above
> >
> > This can be extended later to:
> > CREATE PUBLICATION xx FOR pub_all_func WITH (ddl = 'function:all');
> > CREATE PUBLICATION xx FOR pub_create_trigger (ddl = 'trigger:create');
> >
> > In this approach, we don't need to add more stuff in gram.y and
> > will give fine-grained control as well.
>
> What did you mean by pub_create_alter_table, pub_all_table,
> pg_all_func, and pub_create_trigger? Are they table names or some
> special keywords indicating groups of objects?

Those were table names, I'm slightly modifying the examples to make it
more clearer:
ex 1: CREATE PUBLICATION pub1 FOR table employee, department, sales
WITH (ddl = 'table:create,alter');
In this I meant that create/alter operation on table employee,
department and sales will be replicated to the subscriber.

ex 2: CREATE PUBLICATION pub2 FOR table employee, department, sales
WITH (ddl = 'table:all');
In this I meant that it will replicate all ddl operations(create/alter
and drop) on table employee, department and sales to the subscriber

ex 3: CREATE PUBLICATION pub3 FOR table employee, department, sales
WITH (ddl = 'table');
This is same as example 2, only advantage here is that user need not
specify ":all" along with the syntax, by default it will replicate all
the table operations create/alter and drop.

The create publication syntax will be the same existing syntax, only
change is the addition of the following in the with option:
ddl = table
ddl = table:all
ddl = table:create|alter|drop

Regards,
Vignesh

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Zhijie Hou (Fujitsu) 2023-04-26 11:31:36 RE: Support logical replication of DDLs
Previous Message Amit Kapila 2023-04-26 08:56:50 Re: Support logical replication of DDLs

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2023-04-26 09:47:14 Re: pg_stat_io for the startup process
Previous Message vignesh C 2023-04-26 09:12:06 Re: Add two missing tests in 035_standby_logical_decoding.pl