RE: Support logical replication of DDLs

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Zheng Li <zhengli10(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
Subject: RE: Support logical replication of DDLs
Date: 2022-06-09 11:43:51
Message-ID: OS0PR01MB5716733D6C79D2198E5773CC94A79@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Hi,

I did some research for one potential problem[1] mentioned earlier which is related
to the function execution when replicating DDL.

[1]> 4. Statements that have nondeterministic side effects (e.g., as caused
> by triggers, stored procedures, user-defined functions) may result in
> different side effects occurring on each subscriber.
>
> Think how to handle triggers and functions with same name but different
> purpose.
>

Examples:
ALTER TABLE ADD CONSTRAINT func()
ALTER TABLE ADD COLUMN DEFAULT func()
CREATE TRIGGER ... execute procedure func()
...

When replication the above DDLs, there are some cases we need to think about.

----------------
1) The functions used in DDL have same definition among pub/sub.

In this case, if the functions include DML/DDL operations, it will result in
unexpected behavior.

For example:

--- both pub and sub have the same function test_func().
create function test_func()
returns integer as '
begin
CREATE TABLE test(a int);
INSERT INTO test values(1);
return 0;
end
' language plpgsql;

--- We replicate the following DDL
ALTER TABLE t ADD COLUMN a int DEFAULT test_func();

There are three SQLs that would be replicated to the subscriber:
CREATE TABLE test(a int);
INSERT(1);
ALTER TABLE t ADD COLUMN a int DEFAULT test_func();

Then, we would create the table "test" twice and insert same value twice(first by
executing DEFAULT function, second by replaying the CREATE TABLE and INSERT
command) on subcriber.

One possible idea is that we only allow replicating 'immutable/stable' function
which won't include write action so that we don't have this problem. But it
seems a bit restrictive.

----------------

2) The functions used in DDL have different definitions among pub/sub.

I am not sure how to handle this case as this could result in unpredictable
behavior based on the different definitions. And it's difficult to compare the
function definition among pub/sub because the function could call other
functions nested.

OTOH, the current behavior of this case on HEAD is that we don't check the
consistency of the functions executed on pub/sub. For example: the row triggers
will always be fired on subscriber(when enabled) without checking if the same
trigger exists on publisher. So, it might be acceptable if we document this
restriction, although I am not sure.

*******************

- About the solution for the above two points. I think one solution could be:

We can document that user should make sure the DDL to be replicated should not
execute any function which could execute DML/DDL. This seems acceptable as it's
a pretty rare case to execute DML/DDL in a CONSTRAINT function or DEFAULT
function. And the document[1] already suggest similar thing for CONSTRAINT
function. Besides, we can also document that and the functions should be
defined in a consistent way among pub/sub.

[1] https://www.postgresql.org/docs/devel/ddl-constraints.html
> PostgreSQL assumes that CHECK constraints' conditions are immutable, that is,
> they will always give the same result for the same input row. This assumption
> is what justifies examining CHECK constraints only when rows are inserted or
> updated, and not at other times. (The warning above about not referencing
> other table data is really a special case of this restriction.)

- Another solution could be:

We coud introduce a new function flag called (replica_safety) and the values
could be 'safe'/'unsafe'. 'safe' indicates that it's safe to be replicated to
subscriber and it's safe to be executed when replay the DDL from other
publisher.

When replicating the DDL on publisher, we first check if the functions used in
DDL are replica_safe, if not, we don't replicate that DDL. If yes, before
replicating the DDL, we first send a new type FUNCTIONS message which include
functions(used in DDL or ..) information to the subscriber which is similar to
the RELATION messages. And on subscriber, we check if the received functions
exist and if they are also marked replica safe on subscriber. if functions are
not replica safe on sub, we report an error and suggest user to adjust the
problematic function. I haven't tried to implement it so I could miss
something. Thoughts ?

BTW, attach the POC patch set which only fixed a CFbot error and
added some testcases provided by Osumi-san.

Best regards,
Hou zj

Attachment Content-Type Size
v8-0004-Test-cases-for-DDL-replication.patch application/octet-stream 13.2 KB
v8-0001-Functions-to-deparse-DDL-commands.patch application/octet-stream 85.8 KB
v8-0002-Support-DDL-replication.patch application/octet-stream 124.7 KB
v8-0003-support-CREATE-TABLE-A-ELECT-INTO.patch application/octet-stream 14.1 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Shaozhong SHI 2022-06-09 13:37:27 A function to find errors in groups in a table
Previous Message Lucas 2022-06-09 09:10:53 Cluster OID Limit

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Borisov 2022-06-09 11:45:55 Re: Add 64-bit XIDs into PostgreSQL 15
Previous Message Jakub Wartak 2022-06-09 11:23:36 RE: pgcon unconference / impact of block size on performance