Re: Support logical replication of DDLs

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Zheng Li <zhengli10(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Support logical replication of DDLs
Date: 2022-03-13 11:35:18
Message-ID: CAFiTN-tM1zOk50xoeP5vdzaO0U_iQ3u8oZ2P1L7J=0ka+c7Utw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Mon, Feb 21, 2022 at 9:43 PM Zheng Li <zhengli10(at)gmail(dot)com> wrote:
>
> Hello,
>
> One of the most frequently requested improvements from our customers
> is to reduce downtime associated with software updates (both major and
> minor versions). To do this, we have reviewed potential contributions to
> improving logical replication.
>
> I’m working on a patch to support logical replication of data
> definition language statements (DDLs). This is a useful feature when a
> database in logical replication has lots of tables, functions and
> other objects that change over time, such as in online cross major
> version upgrade.

+1

> I put together a prototype that replicates DDLs using the generic
> messages for logical decoding. The idea is to log the candidate DDL
> string in ProcessUtilitySlow() using LogLogicalMessge() with a new
> flag in WAL record type xl_logical_message indicating it’s a DDL
> message. The xl_logical_message record is decoded and sent to the
> subscriber via pgoutput. The logical replication worker process is
> dispatched for this new DDL message type and executes the command
> accordingly.

If you don't mind, would you like to share the POC or the branch for this work?

> However, there are still many edge cases to sort out because not every
> DDL statement can/should be replicated. Some of these include:

> 3. CREATE TABLE AS and SELECT INTO, For example:
>
> CREATE TABLE foo AS
> SELECT field_1, field_2 FROM bar;
>
> There are a few issues that can occur here. For one, it’s possible
> that table bar doesn't exist on the subscriber. Even if “bar” does
> exist, it may not be fully up-to-date with the publisher, which would
> cause a data mismatch on “foo” between the publisher and subscriber.

In such cases why don't we just log the table creation WAL for DDL
instead of a complete statement which creates the table and inserts
the tuple? Because we are already WAL logging individual inserts and
once you make sure of replicating the table creation I think the exact
data insertion on the subscriber side will be taken care of by the
insert WALs no?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Roger Bos 2022-03-13 14:44:14 delete query using CTE
Previous Message Francisco Olarte 2022-03-13 08:53:49 Re: COPY TO STDOUT WITH (FORMAT CSV, HEADER), and embedded newlines

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhihong Yu 2022-03-13 14:05:10 Re: BufferAlloc: don't take two simultaneous locks
Previous Message Brar Piening 2022-03-13 10:26:25 Re: Add id's to various elements in protocol.sgml