RE: Data is copied twice when specifying both child and parent table in publication

From: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, vignesh C <vignesh21(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>
Subject: RE: Data is copied twice when specifying both child and parent table in publication
Date: 2023-03-27 05:18:07
Message-ID: OS3PR01MB6275E3869C221300794B1C3E9E8B9@OS3PR01MB6275.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 27, 2023 at 11:32 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Mon, Mar 27, 2023 at 7:03 AM Peter Smith <smithpb2250(at)gmail(dot)com>
> wrote:
> >
> >
> > 1.
> > +# two publications, one publishing through ancestor and another one directly
> > +# publsihing the partition, with different row filters
> > +$node_publisher->safe_psql('postgres',
> > + "CREATE PUBLICATION tap_pub_viaroot_sync_1 FOR TABLE
> > tab_rowfilter_viaroot_part_sync WHERE (a > 15) WITH
> > (publish_via_partition_root)"
> > +);
> > +$node_publisher->safe_psql('postgres',
> > + "CREATE PUBLICATION tap_pub_viaroot_sync_2 FOR TABLE
> > tab_rowfilter_viaroot_part_sync_1 WHERE (a < 15)"
> > +);
> > +
> >
> > 1a.
> > Typo "publsihing"

Changed.

> > ~
> >
> > 1b.
> > IMO these table and publication names could be better.
> >
> > I thought it was confusing to have the word "sync" in these table
> > names and publication names. To the casual reader, it looks like these
> > are synchronous replication tests but they are not.
> >
>
> Hmm, sync here is for initial sync, so I don't think it is too much of
> a problem to understand if one is aware that these are logical
> replication related tests.
>
> > Similarly, I thought it was confusing that 2nd publication and table
> > have names with the word "viaroot" when the option
> > publish_via_partition_root is not even true.
> >
>
> I think the better names for tables could be
> "tab_rowfilter_parent_sync, tab_rowfilter_child_sync" and for
> publications "tap_pub_parent_sync_1,
> tap_pub_child_sync_1"

Changed. And removed "_1" in the suggested publication names.
Previously, I added "_1" and "_2" to distinguish between two publications with
the same name. However, the publication names are now different, so I think we
could remove "_1".

Attach the new patch.

Regards,
Wang Wei

Attachment Content-Type Size
v24-0001-Avoid-syncing-data-twice-for-the-publish_via_par.patch application/octet-stream 27.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2023-03-27 05:38:01 Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)
Previous Message Michael Paquier 2023-03-27 04:18:57 Re: Add pg_walinspect function with block info columns