Re: Initial Schema Sync for Logical Replication

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Sachin Kumar" <ssetiya(at)amazon(dot)com>, "Alvaro Herrera" <alvherre(at)alvh(dot)no-ip(dot)org>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Initial Schema Sync for Logical Replication
Date: 2023-03-22 21:17:49
Message-ID: 78149fa6-4c77-4128-8518-197a631c29c3@app.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 21, 2023, at 8:18 AM, Amit Kapila wrote:
> Now, how do we avoid these problems even if we have our own version of
> functionality similar to pg_dump for selected objects? I guess we will
> face similar problems. If so, we may need to deny schema sync in any
> such case.
There are 2 approaches for initial DDL synchronization:

1) generate the DDL command on the publisher, stream it and apply it as-is on
the subscriber;
2) generate a DDL representation (JSON, for example) on the publisher, stream
it, transform it into a DDL command on subscriber and apply it.

The option (1) is simpler and faster than option (2) because it does not
require an additional step (transformation). However, option (2) is more
flexible than option (1) because it allow you to create a DDL command even if a
feature was removed from the subscriber and the publisher version is less than
the subscriber version or a feature was added to the publisher and the
publisher version is greater than the subscriber version. Of course there are
exceptions and it should forbid the transformation (in this case, it can be
controlled by the protocol version -- LOGICALREP_PROTO_FOOBAR_VERSION_NUM). A
decision must be made: simple/restrict vs complex/flexible.

One of the main use cases for logical replication is migration (X -> Y where X
< Y). Postgres generally does not remove features but it might happen (such as
WITH OIDS syntax) and it would break the DDL replication (option 1). In the
downgrade case (X -> Y where X > Y), it might break the DDL replication if a
new syntax is introduced in X. Having said that, IMO option (1) is fragile if
we want to support DDL replication between different Postgres versions. It
might eventually work but there is no guarantee.

Per discussion [1], I think if we agree that the Alvaro's DDL deparse patch is
the way to go with DDL replication, it seems wise that it should be used for
initial DDL synchronization as well.

[1] https://www.postgresql.org/message-id/CAA4eK1%2Bw_dFytBiv3RxbOL76_noMzmX0QGTc8uS%3Dbc2WaPVoow%40mail.gmail.com

--
Euler Taveira
EDB https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Imseih (AWS), Sami 2023-03-22 21:35:23 Re: [BUG] pg_stat_statements and extended query protocol
Previous Message Peter Geoghegan 2023-03-22 21:14:38 Re: HOT chain validation in verify_heapam()