RE: Data is copied twice when specifying both child and parent table in publication

From: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: RE: Data is copied twice when specifying both child and parent table in publication
Date: 2022-07-21 10:07:36
Message-ID: OS3PR01MB6275F576802D9FD34C2836849E919@OS3PR01MB6275.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thur, Jul 14, 2022 at 12:46 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> Here are some review comments for the v6 patch (HEAD only):

Thanks for your comments.

> 1. Commit message
>
> If there are two publications that publish the parent table and the child table
> separately, and both specify the option PUBLISH_VIA_PARTITION_ROOT,
> subscribing
> to both publications from one subscription causes initial copy twice. What we
> expect is to be copied only once.
>
> ~
>
> I don’t think the parameter even works in uppercase, so maybe better to say:
> PUBLISH_VIA_PARTITION_ROOT -> 'publish_via_partition_root'

It seems that there are more places to use lowercase than uppercase, so
improved it as suggested.

> 2.
>
> What we expect is to be copied only once.
>
> SUGGESTION
> It should only be copied once.
>
> ~~~
>
> 3.
>
> To fix this, we extend the API of the function pg_get_publication_tables.
> Now, the function pg_get_publication_tables could receive the publication list.
> And then, if we specify option viaroot, we could exclude the partitioned table
> whose ancestor belongs to the publication list when getting the table
> informations.
>
> ~
>
> Don't you mean "partition table" instead of "partitioned table"?
>
> SUGGESTION (also reworded)
> To fix this, the API function pg_get_publication_tables has been
> extended to take a publication list. Now, when getting the table
> information, if the publish_via_partition_root is true, the function
> can exclude a partition table whose ancestor is also published by the
> same publication list.

Improved and fixed as suggested.

> 4. src/backend/catalog/pg_publication.c - pg_get_publication_tables
>
> - publication = GetPublicationByName(pubname, false);
> + arr = PG_GETARG_ARRAYTYPE_P(0);
> + deconstruct_array(arr, TEXTOID, -1, false, TYPALIGN_INT,
> + &elems, NULL, &nelems);
>
> Maybe should have some comment to describe that this function
> parameter is now an array of publications names.

Add the following comment: `/* Deconstruct the parameter into elements. */`.
Also improved the comment above the function pg_get_publication_tables:
`Returns information of tables in one or more publications.`
-->
`Returns information of the tables in the given publication array.`

> 5.
>
> + /* get Oids of tables from each publication */
>
> Uppercase comment

Improved as suggested.

> 6.
>
> + ArrayType *arr;
> + Datum *elems;
> + int nelems,
> + i;
> + Publication *publication;
> + bool viaroot = false;
> + List *pub_infos = NIL;
> + ListCell *lc1,
> + *lc2;
>
> The 'publication' should be declared only in the loop that uses it.
> It's also not good that this is shadowing the same variable name in a
> later declaration.

Reverted changes to variable "publication" declarations.

> 7.
>
> + * Publications support partitioned tables, although all changes
> + * are replicated using leaf partition identity and schema, so we
> + * only need those.
> */
> + if (publication->alltables)
> + current_tables = GetAllTablesPublicationRelations(publication->pubviaroot);
> + else
> + {
> + List *relids,
> + *schemarelids;
> +
> + relids = GetPublicationRelations(publication->oid,
> + publication->pubviaroot ?
> + PUBLICATION_PART_ROOT :
> + PUBLICATION_PART_LEAF);
> + schemarelids = GetAllSchemaPublicationRelations(publication->oid,
> + publication->pubviaroot ?
> + PUBLICATION_PART_ROOT :
> + PUBLICATION_PART_LEAF);
> + current_tables = list_concat(relids, schemarelids);
> + }
>
> Somehow I was confused by this comment because it says you only need
> the LEAF tables but then the subsequent code is getting ROOT relations
> anyway... Can you clarify the comment some more?

I think this is a slight mistake when publication parameter
"publish_via_partition_root" was introduced before.
I improved the comment to the following:
```
Publications support partitioned tables. If
publish_via_partition_root is false, all changes are replicated
using leaf partition identity and schema, so we only need those.
Otherwise, If publish_via_partition_root is true, get the
partitioned table itself.
```

> 8.
>
> + bool viaroot = false;
>
> I think that should have a comment something like:
> /* At least one publication is using publish_via_partition_root */

Improved as suggested.

> 9.
>
> + /*
> + * Record the published table and the corresponding publication so
> + * that we can get row filters and column list later.
> + */
> + foreach(lc1, tables)
> + {
> + Oid relid = lfirst_oid(lc1);
> +
> + foreach(lc2, pub_infos)
> + {
> + pub_info *pubinfo = (pub_info *) lfirst(lc2);
> +
> + if (list_member_oid(pubinfo->table_list, relid))
> + {
> + Oid *result = (Oid *) malloc(sizeof(Oid) * 2);
> +
> + result[0] = relid;
> + result[1] = pubinfo->pubid;
> +
> + results = lappend(results, result);
> + }
> + }
> }
>
> I felt a bit uneasy about the double-looping here. I wonder if these
> 'results' could have been accumulated within the existing loop over
> all publications. Then the results would need to be filtered to remove
> the ones associated with removed partitions. Otherwise with 10000
> tables and also many publications this (current) double-looping seems
> like it might be quite expensive.

Improved as suggested.

> 10. src/backend/commands/subscriptioncmds.c - fetch_table_list
>
> + if (check_columnlist && server_version >= 160000)
>
> This condition does not make much sense to me. Isn’t it effectively
> same as saying
> if (server_version >= 150000 && server_version >= 160000)
>
> ???

Fixed as suggested.

> 11.
>
> + /*
> + * Get the list of tables from publisher, the partitioned table whose
> + * ancestor is also in this list should be ignored, otherwise the
> + * initial date in the partitioned table would be replicated twice.
> + */
>
> 11.a
> Isn't this comment all backwards? I think you mean to say "partition"
> or "partition table" (not partitioned table) because partitions have
> ancestors but partition-ED tables don't.
>
>
> 11.b
> "initial date" -> "initial data"

Fixed as suggested.

> 12. src/test/subscription/t/013_partition.pl
>
> -# Note: We create two separate tables, not a partitioned one, so that we can
> -# easily identity through which relation were the changes replicated.
> +# Note: We only create one table for the partition table (tab4) here.
> +# Because we specify option PUBLISH_VIA_PARTITION_ROOT (see pub_all and
> +# pub_lower_level above), all data should be replicated to the partition table.
> +# So we do not need to create table for the partitioned table.
>
> 12.a
> AFAIK "tab4" is the *partitioned* table, not a partition. I think this
> comment has all the "partitioned" and "partition" back-to-front.
>
> 12.b
> Also please say “publish_via_partition_root" instead of
> PUBLISH_VIA_PARTITION_ROOT

Fixed as suggested.

> 13. src/test/subscription/t/028_row_filter.pl
>
> @@ -723,8 +727,11 @@ is($result, qq(t|1), 'check replicated rows to
> tab_rowfilter_toast');
> # - INSERT (16) YES, 16 > 15
> $result =
> $node_subscriber->safe_psql('postgres',
> - "SELECT a FROM tab_rowfilter_viaroot_part");
> -is($result, qq(16), 'check replicated rows to tab_rowfilter_viaroot_part');
> + "SELECT a FROM tab_rowfilter_viaroot_part ORDER BY 1");
> +is($result, qq(16
> +17),
> + 'check replicated rows to tab_rowfilter_viaroot_part'
> +);
>
> There is a comment above that code like:
> # tab_rowfilter_viaroot_part filter is: (a > 15)
> # - INSERT (14) NO, 14 < 15
> # - INSERT (15) NO, 15 = 15
> # - INSERT (16) YES, 16 > 15
>
> I think should modify that comment to explain the new data this patch
> inserts - e.g. NO for 13 and YES for 17...

Improved as suggested.

I also improved the patches for back-branch according to some of Peter's
comments and added the back-branch patch for REL_13.
In addition, in the patch (REL15_v6) I attached for REL15 in [1], I forgot to
remove the modification to the function pg_get_publication_tables. I removed
related modifications now (REL15_v7).

Attach the new patches.

[1] - https://www.postgresql.org/message-id/OS3PR01MB6275AFA91925615A4AA782D09EBD9%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Regards,
Wang wei

Attachment Content-Type Size
HEAD_v7-0001-Fix-data-replicated-twice-when-specifying-publish.patch application/octet-stream 17.7 KB
REL15_v7-0001-Fix-data-replicated-twice-when-specifying-publish_patch application/octet-stream 8.5 KB
REL14_v7-0001-Fix-data-replicated-twice-when-specifying-publish_patch application/octet-stream 5.3 KB
REL13_v7-0001-Fix-data-replicated-twice-when-specifying-publish_patch application/octet-stream 5.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-07-21 10:08:11 Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Previous Message Kyotaro Horiguchi 2022-07-21 09:02:59 Re: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?