Re: Data is copied twice when specifying both child and parent table in publication

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, vignesh C <vignesh21(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>
Subject: Re: Data is copied twice when specifying both child and parent table in publication
Date: 2023-03-20 07:32:15
Message-ID: CAHut+PuNsvO9o9XzeJuSLsAsndgCKVphDPBRqYuOTy2bR28E+g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are some review comments for v17-0001.

======
src/backend/catalog/pg_publication.c

1. filter_partitions

-static List *
-filter_partitions(List *relids)
+static void
+filter_partitions(List *table_infos)
{
- List *result = NIL;
ListCell *lc;
- ListCell *lc2;

- foreach(lc, relids)
+ foreach(lc, table_infos)
{
- bool skip = false;
- List *ancestors = NIL;
- Oid relid = lfirst_oid(lc);
+ bool skip = false;
+ List *ancestors = NIL;
+ ListCell *lc2;
+ published_rel *table_info = (published_rel *) lfirst(lc);

- if (get_rel_relispartition(relid))
- ancestors = get_partition_ancestors(relid);
+ if (get_rel_relispartition(table_info->relid))
+ ancestors = get_partition_ancestors(table_info->relid);

foreach(lc2, ancestors)
{
Oid ancestor = lfirst_oid(lc2);
+ ListCell *lc3;

/* Check if the parent table exists in the published table list. */
- if (list_member_oid(relids, ancestor))
+ foreach(lc3, table_infos)
{
- skip = true;
- break;
+ Oid relid = ((published_rel *) lfirst(lc3))->relid;
+
+ if (relid == ancestor)
+ {
+ skip = true;
+ break;
+ }
}
+
+ if (skip)
+ break;
}

- if (!skip)
- result = lappend_oid(result, relid);
+ if (skip)
+ table_infos = foreach_delete_current(table_infos, lc);
}
-
- return result;
}

It seems the 'skip' and 'ancestors' and 'lc2' vars are not needed
except when "if (get_rel_relispartition(table_info->relid))" is true,
so won't it be better to restructure the code to put everything inside
that condition. Then you will save a few unnecessary tests of
foreach(lc2, ancestors) and (skip).

For example,

static void
filter_partitions(List *table_infos)
{
ListCell *lc;

foreach(lc, table_infos)
{
published_rel *table_info = (published_rel *) lfirst(lc);

if (get_rel_relispartition(table_info->relid))
{
bool skip = false;
List *ancestors = get_partition_ancestors(table_info->relid);
ListCell *lc2;

foreach(lc2, ancestors)
{
Oid ancestor = lfirst_oid(lc2);
ListCell *lc3;
/* Check if the parent table exists in the published table list. */
foreach(lc3, table_infos)
{
Oid relid = ((published_rel *) lfirst(lc3))->relid;

if (relid == ancestor)
{
skip = true;
break;
}
}
if (skip)
break;
}

if (skip)
table_infos = foreach_delete_current(table_infos, lc);
}
}
}

~~~

2. pg_get_publication_tables

+ else
+ {
+ List *relids,
+ *schemarelids;
+
+ relids = GetPublicationRelations(pub_elem->oid,
+ pub_elem->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
+ pub_elem->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ pub_elem_tables = list_concat_unique_oid(relids, schemarelids);
+ }

2a.
Maybe 'schema_relids' would be a better name than 'schemareliids'?

~

2b.
By introducing another variable maybe you could remove some of this
duplicated code.

PublicationPartOpt root_or_leaf = pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT : PUBLICATION_PART_LEAF;

~~~

3. pg_get_publication_tables

/* Show all columns when the column list is not specified. */
- if (nulls[1] == true)
+ if (nulls[2] == true)

Since you are changing this line anyway, you might as well change it
to remove the redundant "== true" part.

SUGGESTION
if (nulls[2])

======
src/include/catalog/pg_proc.dat

4.
+{ oid => '6119',
+ descr => 'get information of the tables in the given publication array',

Should that be worded in a way to make it more clear that the
"publication array" is really an "array of publication names"?

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-03-20 07:46:08 Re: Missing rules for queryjumblefuncs.{funcs,switch}.c for maintainer-clean
Previous Message Michael Paquier 2023-03-20 07:32:14 Re: Remove nonmeaningful prefixes in PgStat_* fields