Re: Data is copied twice when specifying both child and parent table in publication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Data is copied twice when specifying both child and parent table in publication
Date: 2021-11-12 04:27:39
Message-ID: CAA4eK1KvfB6LF1u1Q_v-S1fiSaMZZq125cMcg4xixfSf1N+zqw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 11, 2021 at 12:22 PM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Friday, November 5, 2021 11:20 AM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> >On Thu, Nov 4, 2021 at 7:10 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> >Almost.
> >The patch does seem to solve that first problem (double publish on tablesync).
> >I used the following test (taken from [2]), and variations of it:
> >
> >However, there did still seem to be a problem, if publish_via_partition_root is then set to false; it seems that can result in
> >duplicate partition entries in the pg_publication_tables view, see below (this follows on from the test scenario given above):
> >
> >postgres=# select * from pg_publication_tables;
> > pubname | schemaname | tablename
> >---------+------------+-----------
> > pub1 | sch1 | tbl1
> > pub1 | sch3 | t1
> >(2 rows)
> >
> >postgres=# alter publication pub1 set (publish_via_partition_root=false);
> >ALTER PUBLICATION
> >postgres=# select * from pg_publication_tables;
> > pubname | schemaname | tablename
> >---------+------------+------------
> > pub1 | sch2 | tbl1_part1
> > pub1 | sch2 | tbl1_part2
> > pub1 | sch2 | tbl1_part1
> > pub1 | sch3 | t1
> >(4 rows)
> >
> >So I think the patch would need to be updated to prevent that.
>
> Thanks for testing the patch.
>
> The reason for the duplicate output is that the existing function
> GetPublicationRelations doesn't de-duplicate the output OID list. So, when
> both a child table and its parent are added to the publication
> (pubviaroot = false), the pg_publication_tables view outputs duplicate
> partitions.
>
> Attached are the fix patches:
> 0001 fixes the double publishing of data (the first issue in this thread)
> 0002 fixes the duplicate partitions in the pg_publication_tables view (reported by Greg when testing the 0001 patch)
>
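
For illustration only, the duplication described above can be modeled in a few
lines. This is a toy sketch in Python, not the code of GetPublicationRelations
or of either patch: the function name, the dictionary of partitions, and the
OID values are all hypothetical. It assumes that, with pubviaroot = false, a
published partitioned table expands to its leaf partitions, so a child that is
also listed explicitly shows up twice unless the list is de-duplicated.

```python
def expand_publication(published, partitions_of, deduplicate=True):
    """Expand published tables to leaf partitions (toy model, hypothetical names).

    published      -- list of table OIDs added to the publication
    partitions_of  -- dict mapping a partitioned table's OID to its leaf OIDs
    deduplicate    -- if True, keep only the first occurrence of each OID
    """
    seen = set()
    result = []
    for rel in published:
        # A partitioned parent stands for its leaves; any other table for itself.
        for leaf in partitions_of.get(rel, [rel]):
            if deduplicate and leaf in seen:
                continue
            seen.add(leaf)
            result.append(leaf)
    return result


# Hypothetical OIDs: 16384 = tbl1, 16390 = tbl1_part1, 16395 = tbl1_part2.
partitions_of = {16384: [16390, 16395]}
published = [16384, 16390]  # both the parent and one child are in the publication

without_dedup = expand_publication(published, partitions_of, deduplicate=False)
with_dedup = expand_publication(published, partitions_of)
```

Here `without_dedup` lists 16390 twice, which corresponds to the repeated
tbl1_part1 row in the view output, while `with_dedup` lists each partition
once.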

Can we start a separate thread to discuss the 0002 patch, since it
doesn't seem directly related to the duplicate data issue being
discussed here? Please specify the exact test in that email, as that
would make it easier to understand the problem.

--
With Regards,
Amit Kapila.
