Re: Data is copied twice when specifying both child and parent table in publication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, vignesh C <vignesh21(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>
Subject: Re: Data is copied twice when specifying both child and parent table in publication
Date: 2023-03-27 03:31:37
Message-ID: CAA4eK1KgbapE0FWRmF1SX7uunejuKkXxke7iWKWtt-KKHBTovA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 27, 2023 at 7:03 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
>
> 1.
> +# two publications, one publishing through ancestor and another one directly
> +# publsihing the partition, with different row filters
> +$node_publisher->safe_psql('postgres',
> + "CREATE PUBLICATION tap_pub_viaroot_sync_1 FOR TABLE
> tab_rowfilter_viaroot_part_sync WHERE (a > 15) WITH
> (publish_via_partition_root)"
> +);
> +$node_publisher->safe_psql('postgres',
> + "CREATE PUBLICATION tap_pub_viaroot_sync_2 FOR TABLE
> tab_rowfilter_viaroot_part_sync_1 WHERE (a < 15)"
> +);
> +
>
> 1a.
> Typo "publsihing"
>
> ~
>
> 1b.
> IMO these table and publication names could be better.
>
> I thought it was confusing to have the word "sync" in these table
> names and publication names. To the casual reader, it looks like these
> are synchronous replication tests but they are not.
>

Hmm, sync here is for initial sync, so I don't think it is too much of
a problem to understand if one is aware that these are logical
replication related tests.

> Similarly, I thought it was confusing that 2nd publication and table
> have names with the word "viaroot" when the option
> publish_via_partition_root is not even true.
>

I think the better names for tables could be
"tab_rowfilter_parent_sync, tab_rowfilter_child_sync" and for
publications "tap_pub_parent_sync_1,
tap_pub_child_sync_1"

> ~~~
>
> 2.
>
> # The following commands are executed after CREATE SUBSCRIPTION, so these SQL
> # commands are for testing normal logical replication behavior.
> #
>
> ~
>
> I think you should add a couple of INSERTS for the newly added table/s
> also. IMO it is not only better for test completeness, but it causes
> readers to question why there are INSERTS for every other table except
> these ones.
>

The purpose of the test is to test the initial sync's interaction with
'publish_via_partition_root' option. So, adding Inserts after that for
replication doesn't serve any purpose and it also consumes test cycles
without any additional benefit. So, -1 for extending it further.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kirk Wolak 2023-03-27 03:36:10 Re: possible proposal plpgsql GET DIAGNOSTICS oid = PG_ROUTINE_OID
Previous Message Tom Lane 2023-03-27 03:15:46 Re: About the constant-TRUE clause in reconsider_outer_join_clauses