Re: Support logical replication of DDLs

From: Zheng Li <zhengli10(at)gmail(dot)com>
To: Euler Taveira <euler(at)eulerto(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>
Subject: Re: Support logical replication of DDLs
Date: 2022-04-29 17:34:32
Message-ID: CAAD30ULUBpNegQGEx4_CO-eoz4Z1eYmfOKOTPC7ajmT5_n1mVQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Hello,

> You should forbid it. Unless you can decompose the command into multiple SQL
> commands to make it a safe operation for logical replication.
>
> Let's say you want to add a column with a volatile default.
>
> ALTER TABLE foo ADD COLUMN bar double precision DEFAULT random();
>
> If you replicate the DDL command as is, you will have different data
> downstream. You should forbid it. However, this operation can be supported if
> the DDL command is decomposed in multiple steps.
>
> -- add a new column without DEFAULT to avoid rewrite
> ALTER TABLE foo ADD COLUMN bar double precision;
>
> -- future rows could use the DEFAULT expression
> -- it also doesn't rewrite the table
> ALTER TABLE foo ALTER COLUMN bar SET DEFAULT random();
>
> -- it effectively rewrites the table
> -- all rows are built from one source node
> -- data will be the same on all nodes
> UPDATE foo SET bar = random();

I looked more into this. In order to support statements like "ALTER
TABLE foo ADD COLUMN bar double precision DEFAULT random();", we have
two potential solutions, but both of them are non-trivial to
implement:

1. As Euler pointed out, we could decompose the statement on the
publisher into multiple statements so that the table rewrite (using
volatile function) is handled by a DML sub-command. The decomposition
requires changes in parse analysis/transform. We also need new logic
to assemble the decomposed DDL commands string back from the parse
trees so we can log them for logical replication.

2. Force skipping table rewrite when executing the same command on the
subscriber, and let DML replication replicate the table rewrite from
the publisher. The problem is table rewrite is not replicated at all
today, and it doesn't seem easy to just enable it for logical
replication. Table rewrite is an expensive operation involving heap
file swap, details can be found in ATRewriteTables().

In light of this, I propose to temporarily block replication of such
DDL command on the replication worker until we figure out a better
solution. This is implemented in patch
0008-Fail-replication-worker-on-DDL-command-that-rewrites.patch.
Notice only DDL statements that rewrite table using a VOLATILE
expression will be blocked. I don't see a problem replicating
non-volatile expression.
Here is the github commit of the same patch:
https://github.com/zli236/postgres/commit/1e6115cb99a1286a61cb0a6a088f7476da29d0b9

> The ALTER TABLE ... ALTER COLUMN ... TYPE has a similar issue. This DDL command
> can be decomposed to avoid the rewrite. If you are changing the data type, in
> general, you add a new column and updates all rows doing the proper conversion.
> (If you are updating in batches, you usually add a trigger to automatically
> adjust the new column value for INSERTs and UPDATEs. Another case is when you
> are reducing the the typmod (for example, varchar(100) to varchar(20)). In this
> case, the DDL command can de decomposed removing the typmod information (ALTER
> TABLE ... ALTER COLUMN ... TYPE varchar) and replacing it with a CHECK
> constraint.

I tested ALTER TABLE ... ALTER COLUMN ... TYPE. It seems to be working
fine. Is there a particular case you're concerned about?

> --
> Euler Taveira
> EDB https://www.enterprisedb.com/
>

Regards,
Zheng Li

Attachment Content-Type Size
0008-Fail-replication-worker-on-DDL-command-that-rewrites.patch application/octet-stream 4.8 KB

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Jan Wieck 2022-04-29 17:35:26 Re: External psql editor
Previous Message Rich Shepard 2022-04-29 17:27:29 Mail list manager request

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-04-29 17:46:33 Re: [RFC] building postgres with meson -v8
Previous Message Tom Lane 2022-04-29 15:53:15 Re: fix cost subqueryscan wrong parallel cost