Re: Allow logical replication to copy tables in binary format

From: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
To: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allow logical replication to copy tables in binary format
Date: 2022-10-03 11:50:25
Message-ID: CAGPVpCR_g8c1xXfR4kkSXiSt314C0it0seRzfqgTd+oF7N_5jw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Takamichi,

Thanks for your reviews.

I addressed your reviews, please find the attached patch.

osumi(dot)takamichi(at)fujitsu(dot)com <osumi(dot)takamichi(at)fujitsu(dot)com>, 16 Eyl 2022 Cum,
16:51 tarihinde şunu yazdı:

> (1) whitespace issues
>

Fixed

(2) Suggestion to update another general description about the subscription
>
> Kindly have a look at doc/src/sgml/logical-replication.sgml.
>
> "The data types of the columns do not need to match,
> as long as the text representation of the data can be converted to the
> target type.
> For example, you can replicate from a column of type integer to a column
> of type bigint."
>
> With the patch, I think we have an impact about those descriptions
> since enabling the binary option for a subscription and executing the
> initial synchronization requires the same data types for binary format.
>
> I suggest that we update those descriptions as well.
>

You're right, this needs to be stated in the docs. Modified descriptions
accordingly.

> (3) shouldn't test that we fail expectedly with binary copy for different
> types ?
>
> How about having a test that we correctly fail with different data types
> between the publisher and the subscriber, for instance ?
>

Modified 002_types.pl test such that it now tests the replication between
different data types.
It's expected to fail if the binary is enabled, and succeed if not.

> (4) Error message of the binary format copy
>
> I've gotten below message from data type contradiction (between integer
> and bigint).
> Probably, this is unclear for the users to understand the direct cause
> and needs to be adjusted ?
> This might be a similar comment Euler mentioned in [1].
>
> 2022-09-16 11:54:54.835 UTC [4570] ERROR: insufficient data left in
> message
> 2022-09-16 11:54:54.835 UTC [4570] CONTEXT: COPY tab, line 1, column id
>

It's already unclear for users to understand what's the issue if they're
copying data between different column types via the COPY command.
This issue comes from COPY, and logical replication just utilizes COPY.
I don't think it would make sense to adjust an error message from a
functionality which logical replication only uses and has no direct impact
on.
It might be better to do this in a separate patch. What do you think?

> (5) Minor adjustment of the test comment in 002_types.pl.
>
> +is( $result, $sync_result, 'check initial sync on subscriber');
> +is( $result_binary, $sync_result, 'check initial sync on subscriber in
> binary');
>
> # Insert initial test data
>
> There are two same comments which say "Insert initial test data" in this
> file.
> We need to update them, one for the initial table sync and
> the other for the application of changes.
>

Fixed.

I agree with the direction to support binary copy for v16 and later.
>
> IIUC, the binary format replication with different data types fails even
> during apply phase on HEAD.
> I thought that means, the upgrade concern only applies to a scenario that
> the user executes
> only initial table synchronizations between the publisher and subscriber
> and doesn't replicate any data at apply phase after that. I would say
> this isn't a valid scenario and your proposal makes sense.
>

No, logical replication in binary does not fail on apply phase if data
types are different.
The concern with upgrade (if data types are not the same) would be not
being able to create a new subscription with binary enabled or replicate
new tables added into publication.
Replication of tables from existing subscriptions would not be affected by
this change since they will already be in the apply phase, not tablesync.
Do you think this would still be an issue?

Thanks,
Melih

Attachment Content-Type Size
v3-0001-Allow-logical-replication-to-copy-table-in-binary.patch application/octet-stream 24.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Page 2022-10-03 11:55:40 Re: Tracking last scan time
Previous Message Ranier Vilela 2022-10-03 11:05:57 Re: Small miscellaneous fixes