Re: tablesync copy ignores publication actions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Euler Taveira <euler(at)eulerto(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: tablesync copy ignores publication actions
Date: 2022-06-08 04:10:22
Message-ID: CAA4eK1Lb5QpWCQU8qkELnX6t8z7JeVtGantmKptxkkpxnYnpHA@mail.gmail.com

On Tue, Jun 7, 2022 at 7:08 PM Euler Taveira <euler(at)eulerto(dot)com> wrote:
>
> On Tue, Jun 7, 2022, at 1:10 AM, Peter Smith wrote:
>
> The logical replication tablesync ignores the publication 'publish'
> operations during the initial data copy.
>
> This is current/known PG behaviour (e.g. as recently mentioned [1])
> but it was not documented anywhere.
>
> initial data synchronization != replication. The publish parameter is a
> replication property; it is not an initial data synchronization property.
> Maybe we should make it clear like you are suggesting.
>

+1 to document it. We respect some other properties of the publication
during the initial sync, such as the publish_via_partition_root
parameter, column lists, and row filters. So it is better to explain
that the 'publish' parameter is the one we ignore during the initial
sync.

> This patch just documents the existing behaviour and gives some examples.
>
> Why did you add this information to that specific paragraph? IMO it belongs to
> a separate paragraph; I would add it as the first paragraph in that subsection.
>
> I suggest the following paragraph:
>
> <para>
> The initial data synchronization does not take into account the
> <literal>publish</literal> parameter to copy the existing data.
> </para>
>
> There is no point to link the Initial Snapshot subsection. That subsection is
> explaining the initial copy steps and you want to inform about the effect of a
> publication parameter on the initial copy. Although both are talking about the
> same topic (initial copy), that link to Initial Snapshot subsection won't add
> additional information about the publish parameter.
>

Here, we are explaining the behavior of row filters during initial
sync so adding a link to the Initial Snapshot section makes sense to
me.

> You could expand the
> suggested sentence to make it clear what publish parameter is or even add a
> link to the CREATE PUBLICATION synopsis (that explains about publish
> parameter).
>

+1. I suggest we add some text about the behavior of the initial sync
to the CREATE PUBLICATION doc (along with the 'publish' parameter);
otherwise, we can explain it where we talk about publications [1].

> You add an empty paragraph. Remove it.
>
> I'm not sure it deserves an example. It is an easy-to-understand concept and a
> good description is better than ~ 80 new lines.
>

I don't think it is very clear that "initial data synchronization !=
replication", as you mentioned, nor do our docs do a good job of
explaining it; otherwise, the confusion wouldn't have arisen in the
email link shared by Peter. Personally, I think such things are better
explained by example, and in that regard the example shared by Peter
does only half the job because it doesn't show the replication part. I
don't think "Initial Snapshot" is the right place for these examples,
considering we want to show replication based on the publish actions.
We can extend them to show one example with row filters as well. How
about showing these examples in the Subscription section [2]?
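
As a rough illustration of the kind of example meant here (an untested
sketch, not taken from Peter's patch; the table t1, publication pub1,
subscription sub1, and the connection string are all placeholder names),
something along these lines would show both the initial copy and the
subsequent replication:

```sql
-- On the publisher: existing rows, and a publication that only
-- publishes TRUNCATE (so INSERT/UPDATE/DELETE are not replicated).
CREATE TABLE t1 (a int PRIMARY KEY);
INSERT INTO t1 VALUES (1), (2), (3);
CREATE PUBLICATION pub1 FOR TABLE t1 WITH (publish = 'truncate');

-- On the subscriber: the connection string below is a placeholder.
CREATE TABLE t1 (a int PRIMARY KEY);
CREATE SUBSCRIPTION sub1
    CONNECTION 'host=publisher.example dbname=postgres'
    PUBLICATION pub1;

-- Once the initial sync finishes, the subscriber has rows 1, 2, 3,
-- because the tablesync copy ignores the 'publish' parameter.
SELECT * FROM t1;

-- Back on the publisher: this INSERT is not replicated, since
-- 'insert' is not among the published operations ...
INSERT INTO t1 VALUES (4);

-- ... whereas this TRUNCATE is replicated, which should empty t1
-- on the subscriber as well.
TRUNCATE t1;
```

The first half corresponds to the initial data synchronization and the
second half to replication, which is the distinction the docs should
make visible.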

[1]: https://www.postgresql.org/docs/devel/logical-replication-publication.html
[2]: https://www.postgresql.org/docs/devel/logical-replication-subscription.html

--
With Regards,
Amit Kapila.
