Re: Allow logical replication to copy tables in binary format

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "Melih Mutlu" <m(dot)melihmutlu(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allow logical replication to copy tables in binary format
Date: 2022-08-11 02:03:04
Message-ID: 2ebc7ea8-3c46-474e-aea7-5d73ff6165fb@www.fastmail.com
Lists: pgsql-hackers

On Wed, Aug 10, 2022, at 12:03 PM, Melih Mutlu wrote:
> I see that logical replication subscriptions have an option to enable binary [1].
> When it's enabled, subscription requests publisher to send data in binary format.
> But this is only the case for apply phase. In tablesync, tables are still copied as text.
This option could have been included in commit 9de77b54531, but it wasn't.
Maybe it wasn't considered because the initial table synchronization can be a
separate step in your logical replication setup; I don't know. I agree that
the binary option should be available for the initial table synchronization.

> To copy tables, COPY command is used and that command supports copying in binary. So it seemed to me possible to copy in binary for tablesync too.
> I'm not sure if there is a reason to always copy tables in text format. But I couldn't see why not to do it in binary if it's enabled.
The reason to use text format is that the binary format is more error prone.
There are restrictions when using the binary format. For example, if a column
has a different data type on the publisher than on the subscriber, the copy
will fail. Even with such restrictions, I think it is worth adding it.
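
As a rough illustration of that restriction (a Python sketch, not PostgreSQL
code; the function names are made up for this example): in binary format an
int4 value travels as exactly 4 bytes in network byte order, while a bigint
column on the receiving side expects 8 bytes, so the value is rejected. The
same value in text format parses fine for either type. The error text below is
illustrative only.

```python
import struct


def encode_int4(value: int) -> bytes:
    """Hypothetical publisher side: int4 as 4 bytes, network byte order."""
    return struct.pack("!i", value)


def decode_int8(payload: bytes) -> int:
    """Hypothetical subscriber side: a bigint column expects exactly 8 bytes."""
    if len(payload) != 8:
        # Illustrative message, not the exact server error.
        raise ValueError("incorrect binary data format")
    return struct.unpack("!q", payload)[0]


def decode_int8_text(text: str) -> int:
    """Text format: the digits parse the same way for int4 and bigint."""
    return int(text)


payload = encode_int4(42)

# Text representation of an int4 loads into a bigint column without trouble.
print(decode_int8_text("42"))

# Binary representation of an int4 does not match what bigint expects.
try:
    decode_int8(payload)
except ValueError as err:
    print("binary copy failed:", err)
```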

> You can find the small patch that only enables binary copy attached.
I have a few points about your implementation.

* Are we considering supporting prior Postgres versions too? These releases
support binary mode, but an initial sync in binary mode could be unexpected
behavior for a publisher on 14 or 15 with a subscriber on 16. IMO you should
only allow it when the publisher is on 16 or later.
* Docs should say that the binary option also applies to initial table
synchronization and possibly emphasize some of the restrictions.
* Tests. Are the current tests (014_binary.pl) enough?
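
The version restriction in the first point could be sketched like this. It is
only an illustration in Python, assuming the subscriber knows the publisher's
server_version_num; the function name and the constant are hypothetical, and
the real check would live in the C tablesync code.

```python
# Hypothetical cutoff: 16.0 reported as server_version_num.
MIN_BINARY_SYNC_VERSION = 160000


def tablesync_copy_format(binary_enabled: bool, publisher_version_num: int) -> str:
    """Pick the COPY format for the initial table synchronization.

    Binary is used only when the subscription has binary enabled AND the
    publisher is on 16 or later; otherwise fall back to text, which is
    what older publishers would expect.
    """
    if binary_enabled and publisher_version_num >= MIN_BINARY_SYNC_VERSION:
        return "binary"
    return "text"


# A 15.x publisher keeps the text behavior even with binary = true.
print(tablesync_copy_format(True, 150003))
# A 16.0 publisher gets binary when the option is enabled.
print(tablesync_copy_format(True, 160000))
```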

--
Euler Taveira
EDB https://www.enterprisedb.com/
