RE: Support logical replication of DDLs

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Runqi Tian <runqidev(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, li jie <ggysxcq(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Zheng Li <zhengli10(at)gmail(dot)com>
Subject: RE: Support logical replication of DDLs
Date: 2023-04-04 03:13:09
Message-ID: OS0PR01MB571614EE7EEDBF7049DA4EAE94939@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Thursday, March 30, 2023 5:46 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Thu, 30 Mar 2023 at 13:29, houzj(dot)fnst(at)fujitsu(dot)com
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: houzj(dot)fnst(at)fujitsu(dot)com <houzj(dot)fnst(at)fujitsu(dot)com>
> > > Sent: Thursday, March 30, 2023 2:37 PM
> > >
> > > On Tuesday, March 28, 2023 12:13 PM houzj(dot)fnst(at)fujitsu(dot)com
> > > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > > >
> > > > On Monday, March 27, 2023 8:08 PM Amit Kapila
> > > <amit(dot)kapila16(at)gmail(dot)com>
> > > > wrote:
> > > > > On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila
> > > > > <amit(dot)kapila16(at)gmail(dot)com>
> > > > > wrote:
> > > > > >
> > > > > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> wrote:
> > > > > > >
> > > > > >
> > > > > > > I suggest taking a couple of steps back from the minutiae of
> > > > > > > the patch, and spending some hard effort thinking about how
> > > > > > > the thing would be controlled in a useful fashion (that is,
> > > > > > > a real design for the filtering that was mentioned at the
> > > > > > > very outset), and about the security issues, and about how
> > > > > > > we could get to a
> > > committable
> > > > patch.
> > > > > > >
> > > > > >
> > > > > > Agreed. I'll try to summarize the discussion we have till now
> > > > > > on this and share my thoughts on the same in a separate email.
> > > > > >
> > > > >
> > > > > The idea to control what could be replicated is to introduce a
> > > > > new publication option 'ddl' along with current options
> > > > > 'publish' and 'publish_via_partition_root'. The values of this
> > > > > new option could be 'table', 'function', 'all', etc. Here 'all'
> > > > > enables the replication of all supported DDL commands. Example
> usage for this would be:
> > > > > Example:
> > > > > Create a new publication with all ddl replication enabled:
> > > > > CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all');
> > > > >
> > > > > Enable table ddl replication for an existing Publication:
> > > > > ALTER PUBLICATION pub2 SET (ddl = 'table');
> > > > >
> > > > > This is what seems to have been discussed but I think we can
> > > > > even extend it to support based on operations/commands, say one
> > > > > would like to publish only 'create' and 'drop' of tables. Then
> > > > > we can extend the existing publish option to have values like 'create',
> 'alter', and 'drop'.
> > > > >
> > > > > Another thing we are considering related to this is at what
> > > > > level these additional options should be specified. We have
> > > > > three variants FOR TABLE, FOR ALL TABLES, and FOR TABLES IN
> > > > > SCHEMA that enables replication. Now, for the sake of
> > > > > simplicity, this new option is discussed to be provided only
> > > > > with FOR ALL TABLES variant but I think we can provide it with
> > > > > other variants with some additional restrictions like with FOR
> > > > > TABLE, we can only specify 'alter' and 'drop' for publish
> > > > > option. Now, though possible, it brings additional complexity to
> > > > > support it with variants other than FOR ALL TABLES because then
> > > > > we need to ensure additional filtering and possible modification
> > > > > of the content we have to send to downstream. So, we can
> > > even
> > > > decide to first support it only FOR ALL TABLES variant.
> > > > >
> > > > > The other point to consider for publish option 'ddl = table' is
> > > > > whether we need to allow replicating dependent objects like say
> > > > > some user-defined type is used in the table. I guess the
> > > > > difficulty here would be to identify which dependents we want to allow.
> > > > >
> > > > > I think in the first version we should allow to replicate only
> > > > > some of the objects instead of everything. For example, can we
> > > > > consider only allowing tables and indexes in the first version?
> > > > > Then extend it in a phased
> > > > manner?
> > > >
> > > > I think supporting table related stuff in the first version makes
> > > > sense and the patch size could be reduced to a suitable size.
> > >
> > > Based on the discussion, I split the patch into four parts: Table
> > > DDL replication(0001,0002), Index DDL replication(0003), ownership
> > > stuff for table and index(0004), other DDL's replication(0005).
> > >
> > > In this version, I mainly tried to split the patch set, and there
> > > are few OPEN items we need to address later:
> > >
> > > 1) The current publication "ddl" option only have two values: table, all. We
> > > also need to add index and maybe other objects in the list.
> > >
> > > 2) Need to improve the syntax stuff. Currently, we store the option value of
> > > the "with (ddl = xx)" via different columns in the catalog, the
> > > catalog(pg_publication) will have more and more columns as we add
> > > support
> > > for logical replication of other objects in the future. We could store it as
> > > an text array instead.
> > >
> > > OTOH, since we have proposed some other more flexible syntax to
> > > -hackers, the current
> > > syntax might be changed which might also solve this problem.
> > >
> > > 3) The test_ddl_deparse_regress test module is not included in the
> > > set, because
> > > I think we also need to split it into table stuff, index stuff and others,
> > > we can share it after finishing that.
> > >
> > > 4) The patch set could be spitted further to make it easier for reviewer like:
> > > infrastructure for deparser, deparser, logical-decoding, built-in logical
> > > replication, We can do it after some analysis.
> > >
> > > [1]
> > >
> https://www.postgresql.org/message-id/OS0PR01MB571646874A3E165D939
> > > 99A9D94889%40OS0PR01MB5716.jpnprd01.prod.outlook.com
> >
> > The patch needs a rebase due to a recent commit da324d6, here is the
> rebased version.
>
> Thanks for the patches, Few comments:
> 1: create unlogged is replicated but the insert on the same is not replicated:
> create unlogged table t3(c1 int); -- The insert on this is not replicated
>
> 2: "Using index tablespace" option is not replicated:
> create table t3 (c1 int unique using index tablespace tbs1);
>
> publisher:
> \d+ t3
> Table "public.t3"
> Column | Type | Collation | Nullable | Default | Storage |
> Compression | Stats target | Description
> --------+---------+-----------+----------+---------+---------+-------------+
> --------------+-------------
> c1 | integer | | | | plain |
> | |
> Indexes:
> "t3_c1_key" UNIQUE CONSTRAINT, btree (c1), tablespace "tbs1"
> Publications:
> "pub1"
> Access method: heap
>
> Subscriber:
> \d+ t3
> Table "public.t3"
> Column | Type | Collation | Nullable | Default | Storage |
> Compression | Stats target | Description
> --------+---------+-----------+----------+---------+---------+-------------+
> --------------+-------------
> c1 | integer | | | | plain |
> | |
> Indexes:
> "t3_c1_key" UNIQUE CONSTRAINT, btree (c1) Access method: heap
>
> 3:The create table is not replicated whereas the drop table of the same is
> replicated with PUBLISH_VIA_PARTITION_ROOT as default (false):
> create table t1(c1 int) partition by hash ( c1); drop table t1;
>
> 4: Should we document tablespace creation should be taken care of by user:
> create table t1(c1 int) tablespace tbs1; -- As tablespace tbs1 does not exist in the
> subscriber, should we add some documentation for this.
>
> 5: temporary table is not replicated, should we add some documentation for
> this:
> create global temporary table t2(c1 int); -- Should we add some
> documentation for this

Thanks for the comments.

Attach the new version patch set which did the following changes:

1. Allow DDL replication to be used with FOR TABLES IN SCHEMA.
2. Support the index option in the publication ddl in 0004 patch and slightly
refactor the code in pgoutput.
3. Address the comments in [1]. (Thanks Vignesh for helping address the comments)

BTW, while reviewing the patch, I find another problem:

Currently, When DDL is executed, we only choose one DDL event trigger to fire ,
this was done to avoid duplicate deparsing and WAL if user create multiple
publications with multiple event triggers. But this has a problem if user
create multiple publications with different ddl option:

-- This create ddl trigger A which catch and WAL table ddl command
CREATE PUBLICATION A FOR ALL TABLES with (ddl = 'table');

-- This create ddl trigger A which catch and WAL index ddl command
CREATE PUBLICATION B FOR ALL TABLES with (ddl = 'index');

If we only choose one event trigger to fire, we will miss WAL logging one kind
of DDLs.

I am thinking maybe we need to let the ddl trigger catch all the supported ddl
commands(it only catch the DDL that specified in the 'ddl' option in the
current patch), and teach the ddl trigger function to collect the all the
publication information to check if the current DDL is published in any
publications, and if so, deparse and WAL log it. This way, even if we only
choose one event trigger, that trigger can deparse and WAL log the necessary
DDL.

[1] https://www.postgresql.org/message-id/CALDaNm2vBN8oMv-7G%3DDH5rR-u40JGbR9aP4B6nwr71qw17rPFA%40mail.gmail.com

Best Regards,
Hou zj

Attachment Content-Type Size
0005-Apply-the-DDL-change-as-that-same-user-th-2023_04_04.patch application/octet-stream 54.1 KB
0006-Deparser-and-DDL-replication-for-the-rest-2023_04_04.patch application/octet-stream 342.3 KB
0001-Deparser-for-Table-DDL-commands-and-exten-2023_04_04.patch application/octet-stream 158.1 KB
0002-DDL-replication-for-Table-DDL-commands-2023_04_04.patch application/octet-stream 237.6 KB
0003-Deparser-for-INDEX-DDL-commands-2023_04_04.patch application/octet-stream 15.3 KB
0004-DDL-replication-for-index-DDL-commands-2023_04_04.patch application/octet-stream 57.7 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message houzj.fnst@fujitsu.com 2023-04-04 03:21:44 RE: Support logical replication of DDLs
Previous Message jian he 2023-04-04 01:50:42 Re: ​jsonb @@ jsonpath operator doc: ​Only the first item of the result is taken into account

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-04-04 03:13:12 Re: [BUG] pg_stat_statements and extended query protocol
Previous Message Justin Pryzby 2023-04-04 03:04:02 Re: zstd compression for pg_dump