Re: Data is copied twice when specifying both child and parent table in publication

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: Data is copied twice when specifying both child and parent table in publication
Date: 2022-09-27 08:44:58
Message-ID: CAHut+PtTN1Udug3x1eZnDdh8Z6PL5VN=xz1b44VbB9T1z_PtcQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are my review comments for the HEAD_v11-0001 patch:

======

1. General - Another related bug?

In [1] Hou-san wrote:

For another case you mentioned (via_root used when publishing child)
CREATE PUBLICATION pub1 for TABLE parent;
CREATE PUBLICATION pub2 for TABLE child with (publish_via_partition_root);
CREATE SUBSCRIPTION sub connect xxx PUBLICATION pub1,pub2;

The expected behavior is only the child table is published, all the changes
should be replicated using the child table's identity. We should do table sync
only for child tables and is same as the current behavior on HEAD. So, I think
there is no bug in this case.

~

That behaviour seems different to my understanding because the pgdocs
says when the via_root param is true the 'child' table would be using
the 'parent' identity:

[2] publish_via_partition_root determines whether changes in a
partitioned table (or on its partitions) contained in the publication
will be published using the identity and schema of the partitioned
table rather than that of the individual partitions that are actually
changed.

~

So is this another bug (slightly different from the current one being
patched), or is it just some different special behaviour? If it's
another bug then you need to know that ASAP because I think you may
want to fix both of them at the same time which might impact how this
2x data copy patch should be implemented.

Or perhaps just the pgdocs need more notes about special
cases/combinations like this?

======

2. General - documentation?

For this current patch, IIUC it was decided that it is a bug because
the data gets duplicated, and then some sensible rule was decided that
this patch should use to address it (e.g. publishing a child combined
with publishing a parent via_root will just ignore the child's
publication...).

So my question is - is this (new/fixed) behaviour something that a
user will be able to figure out themselves from the existing
documentation, or does this patch now need its own special notes in
the documentation?

======

3. src/backend/catalog/pg_publication.c - pg_get_publication_tables

+ foreach(lc, pub_elem_tables)
+ {
+ Oid *result = (Oid *) malloc(sizeof(Oid) * 2);
+
+ result[0] = lfirst_oid(lc);
+ result[1] = pub_elem->oid;
+ table_infos = lappend(table_infos, result);
+ }

3a.
It looks like each element in the table_infos list is a malloced obj
of 2x Oids (Oid of table, Oid of pub). IMO better to call this element
'table_info' instead of the meaningless 'result'

~

3b.
Actually, I think it would be better if this function defines a little
2-element structure {Oid relid, Oid pubid} to use instead of this
array (which requires knowledge that [0] means relid and [1] means
pubid).

~~~

4.

+ foreach(lc, table_infos)
+ {
+ Oid *table_info_tmp = (Oid *) lfirst(lc);
+
+ if (!list_member_oid(tables, table_info_tmp[0]))
+ table_infos = foreach_delete_current(table_infos, lc);
}
I think the '_tmp' suffix is not helpful here - IMO having another
relid variable would make this more self-explanatory.

Or better yet adopt the suggestion o f #3b and have a little struct
with self-explanatory member names.

=====

5. src/backend/commands/subscriptioncmds.c - fetch_table_list

+ if (server_version >= 160000)
+ appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"

Since there is an else statement block, I think this would be more
readable if there was a statement block here too. YMMV

SUGGESTION
if (server_version >= 160000)
{
appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
...
}

~~~

6.

+ /*
+ * Get the list of tables from publisher, the partition table whose
+ * ancestor is also in this list will be ignored, otherwise the initial
+ * data in the partition table would be replicated twice.
+ */

6a.
"from publisher, the partition" -> "from the publisher. The partition"

~

6b.
This looks like a common comment that also applied to the "if" part,
so it seems more appropriate to move it to where I indicated below.
Perhaps the whole comment needs a bit of massaging after you move
it...

+ /*
+ * Get namespace, relname and column list (if supported) of the tables
+ * belonging to the specified publications.
+ *
+ * HERE <<<<<<<<<
+ *
+ * From version 16, the parameter of the function pg_get_publication_tables
+ * can be an array of publications. The partition table whose ancestor is
+ * also published in this publication array will be filtered out in this
+ * function.
+ */

------
[1] https://www.postgresql.org/message-id/OS0PR01MB5716A30DDEECC59132E1084F94799%40OS0PR01MB5716.jpnprd01.prod.outlook.com
[2] https://www.postgresql.org/docs/devel/sql-createpublication.html

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2022-09-27 08:47:45 Re: Fast COPY FROM based on batch insert
Previous Message Bharath Rupireddy 2022-09-27 08:33:24 Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)