Re: Support logical replication of DDLs

From: Zheng Li <zhengli10(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Support logical replication of DDLs
Date: 2022-06-22 22:22:00
Message-ID: CAAD30U+76WKKuMzNWt6uS48P7yUku+feBgtoFw9juYGYBtEweQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

> Here are some points in my mind about the two approaches discussed here.
>
> 1) search_patch vs schema qualify
>
> Again, I still think it will bring more flexibility and security by schema qualify the
> objects in DDL command as mentioned before[1].

I wonder what security concerns you have? We certainly don't want to
log the search_path
if there are serious security issues.

> Besides, a schema qualified DDL is also more appropriate for other use
> cases(e.g. a table-level replication). As it's possible the schema is different
> between pub/sub and it's easy to cause unexpected and undetectable failure if
> we just log the search_path.
>
> It makes more sense to me to have the same style WAL log(schema qualified) for
> both database level or table level replication as it will bring more
> flexibility.

I think it's reasonable to consider using different formats for the
two different use cases.
Especially if the space and time overhead of the deparser format
sticks out. I also don't
think we need to use the deparser for global objects DDL such as ROLE
statements because
no schema qualification is needed. Also another issue with ROLE
statements is that they
are not captured by event triggers currently.

> > "Create Table As .." is already handled by setting the skipData flag of the
> > statement parsetreee before replay:
>
> 2) About the handling of CREATE TABLE AS:
>
> I think it's not a appropriate approach to set the skipdata flag on subscriber
> as it cannot handle EXECUTE command in CTAS.
>
> CREATE TABLE q5_prep_results AS EXECUTE q5(200, 'DTAAAA');
>
> The Prepared statement is a temporary object which we don't replicate. So if
> you directly execute the original SQL on subscriber, even if you set skipdata
> it will fail.
>
> I think it difficult to make this work as you need handle the create/drop of
> this prepared statement. And even if we extended subscriber's code to make it
> work, it doesn't seems like a standard and elegant approach.

This is indeed an interesting case, thanks for pointing this out. One
light weight solution
I can think of is to directly deparse the parsetree on the publisher
into a simple CREATE TABLE statement
without the prepared statement and then replicate the simple CREATE
TABLE statement .
This doesn't have to involve the json format though.

> > "Alter Table .. " that rewrites with volatile expressions can also be handled
> > without any syntax change, by enabling the table rewrite replication and
> > converting the rewrite inserts to updates. ZJ's patch introduced this solution.
>
> 3) About the handling of ALTER TABLE rewrite.
>
> The approach I proposed before is based on the event trigger + deparser
> approach. We were able to improve that approach as we don't need to replicate
> the rewrite in many cases. For example: we don't need to replicate rewrite dml
> if there is no volatile/mutable function. We should check and filter these case
> at publisher (e.g. via deparser) instead of checking that at subscriber.

Surely we can make the check about volatile/mutable functions on the
publisher side
as well. It doesn't have to be done via the deparser.

> Besides, as discussed, we need to give warning or error for the cases when DDL
> contains volatile function which would be executed[2]. We should check this at
> publisher as well(via deparser).

Again, I think the check doesn't have to be done via the deparser.

> > I think for such cases it's not full database replication and we could treat it as
> > table level DDL replication, i.e. use the the deparser format.
>
> 4) I think the point could be that we should make the WAL log format extendable
> so that we can extend it to support more useful feature(table filter/schema
> maps/DDL filter). If we just WAL log the original SQL, it seems it's difficult
> to extend it in the future ?

My point is that for full replication/version upgrade use cases we
don't need to worry about extending
it for features such as schema mapping. Because such use cases
naturally want to keep identical schema
structures.

With Regards,
Zheng

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Peter J. Holzer 2022-06-22 22:25:58 Re: Tuning a query with ORDER BY and LIMIT
Previous Message Peter J. Holzer 2022-06-22 22:19:19 Re: Tuning a query with ORDER BY and LIMIT

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2022-06-22 22:35:18 Re: Tightening behaviour for non-immutable behaviour in immutable functions
Previous Message Andres Freund 2022-06-22 21:39:43 Re: SLRUs in the main buffer pool - Page Header definitions