RE: Data is copied twice when specifying both child and parent table in publication

From: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
To: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: RE: Data is copied twice when specifying both child and parent table in publication
Date: 2022-05-12 01:47:32
Message-ID: TYCPR01MB8373BEA17F82CBFC5121AF65EDCB9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wednesday, May 11, 2022 11:33 AM I wrote:
> On Monday, May 9, 2022 10:51 AM wangw(dot)fnst(at)fujitsu(dot)com
> <wangw(dot)fnst(at)fujitsu(dot)com> wrote:
> > Attach new patches.
> > The patch for HEAD:
> > 1. Modify the approach. Enhance the API of function
> > pg_get_publication_tables to handle one publication or an array of
> > publications.
> > The patch for REL14:
> > 1. Improve the table sync SQL. [suggestions by Shi yu]
> Hi, thank you for updating the patch !
>
> Minor comments on your patch for HEAD v2.
>
> (1) commit message sentence
>
> I suggest below sentence.
>
> Kindly change from
> "... when subscribing to both publications using one subscription, the data is
> replicated twice in inital copy"
> to "subscribing to both publications from one subscription causes initial copy
> twice".
>
> (2) unused variable
>
> pg_publication.c: In function ‘pg_get_publication_tables’:
> pg_publication.c:1091:11: warning: unused variable ‘pubname’
> [-Wunused-variable]
> char *pubname;
>
> We can remove this.
>
> (3) free of allocated memory
>
> In the pg_get_publication_tables(),
> we don't free 'elems'. Don't we need it ?
>
> (4) some coding alignments
>
> 4-1.
>
> + List *tables_viaroot = NIL,
> ...
> + *current_table = NIL;
>
> I suggest we can put some variables
> into the condition for the first time call of this function, like tables_viaroot and
> current_table.
> When you agree, kindly change it.
>
> 4-2.
>
> + /*
> + * Publications support partitioned tables, although
> all changes
> + * are replicated using leaf partition identity and
> schema, so we
> + * only need those.
> + */
> + if (publication->alltables)
> + {
> + current_table =
> GetAllTablesPublicationRelations(publication->pubviaroot);
> + }
>
> This is not related to the change itself and now we are inheriting the previous
> curly brackets, but I think there's no harm in removing it, since it's only for one
> statement.
Hi,

One more thing I'd like to add is that
we don't hit the below code by tests.
In the HEAD v2, we add a new filtering logic in pg_get_publication_tables.
Although I'm not sure if this is related to the bug fix itself,
when we want to include it in this patch, then
I feel it's better to add some simple test for this part,
to cover all the new main paths and check if
new logic works correctly.

+ /*
+ * If a partition table is published in a publication with viaroot,
+ * and its parent or child table is published in another publication
+ * without viaroot. Then we need to move the parent or child table
+ * from tables to tables_viaroot.
+ *
+ * If all publication(s)'s viaroot are the same, then skip this part.
+ */

....
if (ancestor_viaroot == ancestor)
+ {
+ tables = foreach_delete_current(tables, lc2);
+ change_tables = list_append_unique_oid(change_tables,
+ relid);
+ }

Best Regards,
Takamichi Osumi

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2022-05-12 01:59:04 Re: gitmaster access
Previous Message Bruce Momjian 2022-05-12 01:42:46 Re: gitmaster access