RE: Support logical replication of DDLs

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Runqi Tian <runqidev(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, li jie <ggysxcq(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Zheng Li <zhengli10(at)gmail(dot)com>
Subject: RE: Support logical replication of DDLs
Date: 2023-03-30 07:59:34
Message-ID: OS0PR01MB57169AB2355A90C9FA87DE75948E9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

> -----Original Message-----
> From: houzj(dot)fnst(at)fujitsu(dot)com <houzj(dot)fnst(at)fujitsu(dot)com>
> Sent: Thursday, March 30, 2023 2:37 PM
>
> On Tuesday, March 28, 2023 12:13 PM houzj(dot)fnst(at)fujitsu(dot)com
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Monday, March 27, 2023 8:08 PM Amit Kapila
> <amit(dot)kapila16(at)gmail(dot)com>
> > wrote:
> > > On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> > > wrote:
> > > >
> > > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > > >
> > > >
> > > > > I suggest taking a couple of steps back from the minutiae of the
> > > > > patch, and spending some hard effort thinking about how the thing
> > > > > would be controlled in a useful fashion (that is, a real design
> > > > > for the filtering that was mentioned at the very outset), and
> > > > > about the security issues, and about how we could get to a
> committable
> > patch.
> > > > >
> > > >
> > > > Agreed. I'll try to summarize the discussion we have till now on
> > > > this and share my thoughts on the same in a separate email.
> > > >
> > >
> > > The idea to control what could be replicated is to introduce a new
> > > publication option 'ddl' along with current options 'publish' and
> > > 'publish_via_partition_root'. The values of this new option could be
> > > 'table', 'function', 'all', etc. Here 'all' enables the replication of
> > > all supported DDL commands. Example usage for this would be:
> > > Example:
> > > Create a new publication with all ddl replication enabled:
> > > CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all');
> > >
> > > Enable table ddl replication for an existing Publication:
> > > ALTER PUBLICATION pub2 SET (ddl = 'table');
> > >
> > > This is what seems to have been discussed but I think we can even
> > > extend it to support based on operations/commands, say one would like
> > > to publish only 'create' and 'drop' of tables. Then we can extend the
> > > existing publish option to have values like 'create', 'alter', and 'drop'.
> > >
> > > Another thing we are considering related to this is at what level
> > > these additional options should be specified. We have three variants
> > > FOR TABLE, FOR ALL TABLES, and FOR TABLES IN SCHEMA that enables
> > > replication. Now, for the sake of simplicity, this new option is
> > > discussed to be provided only with FOR ALL TABLES variant but I think
> > > we can provide it with other variants with some additional
> > > restrictions like with FOR TABLE, we can only specify 'alter' and
> > > 'drop' for publish option. Now, though possible, it brings additional
> > > complexity to support it with variants other than FOR ALL TABLES
> > > because then we need to ensure additional filtering and possible
> > > modification of the content we have to send to downstream. So, we can
> even
> > decide to first support it only FOR ALL TABLES variant.
> > >
> > > The other point to consider for publish option 'ddl = table' is
> > > whether we need to allow replicating dependent objects like say some
> > > user-defined type is used in the table. I guess the difficulty here
> > > would be to identify which dependents we want to allow.
> > >
> > > I think in the first version we should allow to replicate only some of
> > > the objects instead of everything. For example, can we consider only
> > > allowing tables and indexes in the first version? Then extend it in a phased
> > manner?
> >
> > I think supporting table related stuff in the first version makes sense and the
> > patch size could be reduced to a suitable size.
>
> Based on the discussion, I split the patch into four parts: Table DDL
> replication(0001,0002), Index DDL replication(0003), ownership stuff for table
> and index(0004), other DDL's replication(0005).
>
> In this version, I mainly tried to split the patch set, and there are few
> OPEN items we need to address later:
>
> 1) The current publication "ddl" option only have two values: table, all. We
> also need to add index and maybe other objects in the list.
>
> 2) Need to improve the syntax stuff. Currently, we store the option value of
> the "with (ddl = xx)" via different columns in the catalog, the
> catalog(pg_publication) will have more and more columns as we add
> support
> for logical replication of other objects in the future. We could store it as
> an text array instead.
>
> OTOH, since we have proposed some other more flexible syntax to -hackers,
> the current
> syntax might be changed which might also solve this problem.
>
> 3) The test_ddl_deparse_regress test module is not included in the set,
> because
> I think we also need to split it into table stuff, index stuff and others,
> we can share it after finishing that.
>
> 4) The patch set could be spitted further to make it easier for reviewer like:
> infrastructure for deparser, deparser, logical-decoding, built-in logical
> replication, We can do it after some analysis.
>
> [1]
> https://www.postgresql.org/message-id/OS0PR01MB571646874A3E165D939
> 99A9D94889%40OS0PR01MB5716.jpnprd01.prod.outlook.com

The patch needs a rebase due to a recent commit da324d6, here is the rebased version.

Best Regards,
Hou zj

Attachment Content-Type Size
0005-Deparser-and-DDL-replication-for-the-rest-2023_03_30.patch application/octet-stream 330.9 KB
0001-Deparser-for-Table-DDL-commands-and-exten-2023_03_30.patch application/octet-stream 157.4 KB
0002-DDL-replication-for-Table-DDL-commands-2023_03_30.patch application/octet-stream 234.1 KB
0003-Deparser-for-INDEX-DDL-commands-2023_03_30.patch application/octet-stream 14.5 KB
0004-Apply-the-DDL-change-as-that-same-user-th-2023_03_30.patch application/octet-stream 54.1 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Dominique Devienne 2023-03-30 08:01:29 Re: Using CTID system column as a "temporary" primary key
Previous Message Sebastien Flaesch 2023-03-30 06:54:39 Re: Using CTID system column as a "temporary" primary key

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2023-03-30 08:00:29 Re: [EXTERNAL] Support load balancing in libpq
Previous Message Julien Rouhaud 2023-03-30 07:57:22 Re: [POC] Allow an extension to add data into Query and PlannedStmt nodes