Re: Data is copied twice when specifying both child and parent table in publication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, vignesh C <vignesh21(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>
Subject: Re: Data is copied twice when specifying both child and parent table in publication
Date: 2023-03-24 06:16:53
Message-ID: CAA4eK1+3YeJyFAanTn2reM6ssPqusPsucb5z4Ufwc=rBsFqWOw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 24, 2023 at 7:19 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi Wang-san. I looked at the v21-0001 patch.
>
> I don't have any new review comments -- only follow-ups for some of my
> previous v20 comments that were rejected.
>
> On Thu, Mar 23, 2023 at 8:11 PM wangw(dot)fnst(at)fujitsu(dot)com
> <wangw(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Thu, Mar 23, 2023 at 12:27 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > > Here are some review comments for patch v20-0001.
> >
> ...
> > > ======
> > > src/test/subscription/t/013_partition.pl
> > >
> > > 5.
> > > I think maybe you could have another test scenario where you INSERT
> > > something into tab4_1_1 while only the publication for tab4_1 has
> > > publish_via_partition_root=true
> >
> > I'm not sure if this scenario is necessary.
> >
>
> Please see my reply for #7 below.
>
> > > ~~~
> > >
> > > 6.
> > > AFAICT the tab4 tests are only testing the initial sync, but are not
> > > testing normal replication. Maybe some more normal (post sync) INSERTS
> > > are needed to tab4, tab4_1, tab4_1_1 ?
> >
> > Since I think the scenario we fixed is sync and not replication, it doesn't seem
> > like we should extend the test you mentioned.
> >
>
> Maybe you are right. I only thought it would be better to have testing
> which verifies that the sync phase and the normal replication phase
> are using the same rules.
>

Yeah, we could extend such tests if we want but I think it is not a
must as the patch didn't change this behavior.

> > > ======
> > > src/test/subscription/t/028_row_filter.pl
> > >
> > > 7.
> > > +# insert data into partitioned table.
> > > +$node_publisher->safe_psql('postgres',
> > > + "INSERT INTO tab_rowfilter_viaroot_part(a) VALUES(13), (17)");
> > > +
> > > $node_subscriber->safe_psql('postgres',
> > > "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
> > > application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2,
> > > tap_pub_3, tap_pub_4a, tap_pub_4b, tap_pub_5a, tap_pub_5b,
> > > tap_pub_toast, tap_pub_inherits, tap_pub_viaroot_2, tap_pub_viaroot_1"
> > > );
> > > @@ -707,13 +711,17 @@ is($result, qq(t|1), 'check replicated rows to
> > > tab_rowfilter_toast');
> > > # the row filter for the top-level ancestor:
> > > #
> > > # tab_rowfilter_viaroot_part filter is: (a > 15)
> > > +# - INSERT (13) NO, 13 < 15
> > > # - INSERT (14) NO, 14 < 15
> > > # - INSERT (15) NO, 15 = 15
> > > # - INSERT (16) YES, 16 > 15
> > > +# - INSERT (17) YES, 17 > 15
> > > $result =
> > > $node_subscriber->safe_psql('postgres',
> > > - "SELECT a FROM tab_rowfilter_viaroot_part");
> > > -is($result, qq(16), 'check replicated rows to tab_rowfilter_viaroot_part');
> > > + "SELECT a FROM tab_rowfilter_viaroot_part ORDER BY 1");
> > > +is( $result, qq(16
> > > +17),
> > > + 'check replicated rows to tab_rowfilter_viaroot_part');
> > >
> > > ~
> > >
> > > I'm not 100% sure this is testing quite what you want to test. AFAICT
> > > the subscription is created like:
> > >
> > > "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
> > > application_name=$appname' PUBLICATION tap_pub_1, tap_pub_2,
> > > tap_pub_3, tap_pub_4a, tap_pub_4b, tap_pub_5a, tap_pub_5b,
> > > tap_pub_toast, tap_pub_inherits, tap_pub_viaroot_2, tap_pub_viaroot_1"
> >
> > I think this is the scenario we fixed : Simultaneously subscribing to two
> > publications that publish the parent and child respectively, then want to use
> > the parent's identity and schema).
> >
>
> Yeah, but currently BOTH the tap_pub_viaroot_2, tap_pub_viaroot_1 are
> using "WITH (publish_via_partition_root)", so IMO the user would
> surely expect that only the root table would be published even when a
> subscription combines those publications. OTOH, I thought a subtle
> point of this patch is that now the same result will happen even if
> only ONE of the publications was using "WITH
> (publish_via_partition_root)". So it’s that scenario of “only ONE
> publication is using the option” that I thought ought to be explicitly
> tested.
>

The current change to existing tests is difficult to understand. I
suggest writing a separate test for row filter and then cover the
scenario you have suggested.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Drouvot, Bertrand 2023-03-24 06:23:20 Re: Orphaned wait event
Previous Message Brar Piening 2023-03-24 05:59:33 Re: doc: add missing "id" attributes to extension packaging page