Re: Skipping schema changes in publication

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, YeXiu <1518981153(at)qq(dot)com>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Skipping schema changes in publication
Date: 2025-11-24 07:32:39
Message-ID: CAHut+Pudi+9ssBR_Q_Fd29aGEu8s18OyKUGo5w5aKJK-2_c+8g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 21, 2025 at 5:55 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi Shlok.
>
> Here are some review comments for your patch v28-0003 (EXCEPT TABLE ...).
>
> The review of this patch is a WIP. In this post I only looked at the test code.
>

Here are my remaining review comments for patch v28-0003 (EXCEPT TABLE ...).

======
doc/src/sgml/ref/create_publication.sgml

1.
-ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
ADD ALL TABLES
+ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
ADD ALL TABLES [ EXCEPT [ TABLE ] ( <replaceable
class="parameter">table_exception_object</replaceable> [, ... ] ) ]

Why is that optional [TABLE] keyword needed?

I know PostGres commands sometimes have "noise" words in the syntax so
the command can be more English-like, but in this case, the
publication is a FOR ALL *TABLES* anyway, so I am not sure what the
benefit is of the user being able to say TABLE a 2nd time?

======
src/backend/catalog/pg_publication.c

2.
+ /*
+ * Check for partitions of partitioned table which are specified with
+ * EXCEPT clause and partitioned table is published with
+ * publish_via_partition_root = true.
+ */

I think you can just say "partitions" or "table partitions", but
"partitions of [a] partitioned table" seems overkill.

Also, "... and partitioned table is published with
publish_via_partition_root = true." seems too wordy. Isn't that just
the same as "... and publish_via_partition_root = true"

SUGGESTION
Check for when the publication says "EXCEPT TABLE (partition)" but
publish_via_partition_root = true.

~~~

3.
-/* Gets list of publication oids for a relation */
+/* Gets list of publication oids for a relation that matches the except_flag */
List *
-GetRelationPublications(Oid relid)
+GetRelationPublications(Oid relid, bool except_flag)
{
List *result = NIL;
CatCList *pubrellist;
@@ -765,7 +791,8 @@ GetRelationPublications(Oid relid)
HeapTuple tup = &pubrellist->members[i]->tuple;
Oid pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid;

- result = lappend_oid(result, pubid);
+ if (except_flag == ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept)
+ result = lappend_oid(result, pubid);
}

I was wondering if it might be better to return 2 lists from this
function (e.g. an included-list, and an excluded-list) instead of
passing the 'except_flag' like the current code. IIUC, you are mostly
calling this function twice to get 2 lists anyway, but returning 2
lists instead of 1, this function might be more efficient since it
will only process the publication loop once.

~~~

4.
/*
* Gets list of relation oids for a publication that matches the except_flag.
*
* This should only be used FOR TABLE publications, the FOR ALL TABLES/SEQUENCES
* should use GetAllPublicationRelations().
*/
List *
GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt,
bool except_flag)
Something doesn't seem right -- the function comment says we shouldn't
be calling the function for FOR ALL TABLES, but meanwhile, EXCEPT
TABLE is currently only implemented via FOR ALL TABLES. So it feels
contradictory. Maybe it is just the comment that needs updating?

~~~

5.
/*
* Gets list of relation oids for a publication that matches the except_flag.
*
* This should only be used FOR TABLE publications, the FOR ALL TABLES/SEQUENCES
* should use GetAllPublicationRelations().
*/
List *
GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt,
bool except_flag)
{
List *result;
Relation pubrelsrel;
ScanKeyData scankey;
SysScanDesc scan;
HeapTuple tup;

/* Find all publications associated with the relation. */
pubrelsrel = table_open(PublicationRelRelationId, AccessShareLock);

Existing bug? Isn't this a bogus comment?
/* Find all publications associated with the relation. */

Was that meant to be the other way around? -- e.g. Find all the
relations associated with the specified publication.

======
src/backend/commands/publicationcmds.c

6.
+ default:
+ /* shouldn't happen */
+ elog(ERROR, "invalid publication object type %d",
+ puballobj->pubobjtype);
+ break;

I think the ERROR is enough of a clue that it shouldn't happen. I felt
the comment was redundant.

~~~

ObjectsInPublicationToOids:

7.
case PUBLICATIONOBJ_TABLE:
+ pubobj->pubtable->except = false;
+ *rels = lappend(*rels, pubobj->pubtable);
+ break;
+ case PUBLICATIONOBJ_EXCEPT_TABLE:
+ pubobj->pubtable->except = true;
*rels = lappend(*rels, pubobj->pubtable);
break;
Those are very similar. How about combining like below?

case PUBLICATIONOBJ_TABLE:
case PUBLICATIONOBJ_EXCEPT_TABLE:
pubobj->pubtable->except = (pubobj->pubobjtype ==
PUBLICATIONOBJ_EXCEPT_TABLE);
*rels = lappend(*rels, pubobj->pubtable);
break;

~~

pub_contains_invalid_column:

8.
pub_contains_invalid_column(Oid pubid, Relation relation, List *ancestors,
bool pubviaroot, char pubgencols_type,
- bool *invalid_column_list,
+ bool puballtables, bool *invalid_column_list,
bool *invalid_gen_col)

The 'pub_via_root' and 'pubgencols_type' are parameters. Somehow it
seems more natural for the 'puballtables' to be passed before those,
because FOR ALL TABLES comes before WITH in the syntax.

~~~

CreatePublication:

9.
else if (!stmt->for_all_sequences)
- {
ObjectsInPublicationToOids(stmt->pubobjects, pstate, &relations,
&schemaidlist);

AFAICT, this function is refactored a lot because of the removal of
that '{'. It looks like mostly whitespace, but really, I think the
logic is quite different. I wasn't sure what that was about. Is it
related to this patch, or some other bugfix in passing or what?

======
src/backend/commands/tablecmds.c

ATPrepChangePersistence:

10.
- GetRelationPublications(RelationGetRelid(rel)) != NIL)
+ list_length(GetRelationPublications(RelationGetRelid(rel), false)) > 0)

Isn't an empty List the same as a NIL list? Maybe that list_length()
change was not really needed.

======
src/backend/parser/gram.y

11.
drop_option_list pub_obj_list pub_all_obj_type_list
+ except_pub_obj_list opt_except_clause

Is this name consistent with the others? Should it be pub_except_obj_list?

~~~

12.
%type <publicationobjectspec> PublicationObjSpec
+%type <publicationobjectspec> ExceptPublicationObjSpec
%type <publicationallobjectspec> PublicationAllObjSpec

Is this name consistent with the others? Should it be PublicationExceptObjSpec?

~~~

CreatePublicationStmt:

13.
n->pubname = $3;
+ n->pubobjects = $5;

I noticed that sometimes there is a cast (List *) and other times
there is not. e.g. none here, but cast in AlterPublicationStmt. Why
the differences?

~~~

PublicationObjSpec:

14.
The comment for 'PublicationObjSpec' says "FOR TABLE and FOR TABLES IN
SCHEMA specifications". If that comment is correct, then why is this
patch changing this code? OTOH, if the code is correct, then does the
comment need updating?

======
src/bin/pg_dump/pg_dump.c

15.
Shouldn't there have already been some ALTER ... ADD ALL TABLE dump
code and test code implemented back in patch 0002?

~~~

dumpPublication:

16.
else if (pubinfo->puballtables)
+ {
+ SimplePtrListCell *cell;
+
appendPQExpBufferStr(query, " FOR ALL TABLES");
+
+ /* Include exception tables if the publication has except tables */
+ for (cell = exceptinfo.head; cell; cell = cell->next)
+ {
+ PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr;
+ TableInfo *tbinfo;
+
+ if (pubinfo == pubrinfo->publication)
+ {
+ tbinfo = pubrinfo->pubtable;
+
+ if (first)
+ {
+ appendPQExpBufferStr(query, " EXCEPT TABLE (");
+ first = false;
+ }
+ else
+ appendPQExpBufferStr(query, ", ");
+ appendPQExpBuffer(query, "ONLY %s", fmtQualifiedDumpable(tbinfo));
+ }
+ }
+ if (!first)
+ appendPQExpBufferStr(query, ")");
+ }

16a.
SimplePtrListCell *cell can be declared as a for-loop variable.

~

16b.
The comment should say "EXCEPT TABLES" in uppercase.

~

16c.
I am not convinced you can use that 'first' flag like you are doing.
Isn't that interfering with the existing usage of that flag? Perhaps
another boolean just for this EXCEPT loop is needed.

~~~

getPublicationTables:

17.
+ if (strcmp(prexcept, "f") == 0)
+ pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;
+ else
+ pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL;
+

...

+ if (strcmp(prexcept, "t") == 0)
+ simple_ptr_list_append(&exceptinfo, &pubrinfo[j]);
+

Here you are comparing the same 'prexcept' flag for both "f" and "t".

I felt it was better if both comparisons are the same (e.g. both "t").

Or better still, assign a new boolean and avoid that 2nd strcmp
entirely -- e.g. except_flag = (strcmp(prexcept, "t") == 0);

======
src/bin/pg_dump/pg_dump_sort.c

DOTypeNameCompare:

18.
+ else if (obj1->objType == DO_PUBLICATION_EXCEPT_REL)
+ {
+ PublicationRelInfo *probj1 = *(PublicationRelInfo *const *) p1;
+ PublicationRelInfo *probj2 = *(PublicationRelInfo *const *) p2;
+
+ /* Sort by publication name, since (namespace, name) match the rel */
+ cmpval = strcmp(probj1->publication->dobj.name,
+ probj2->publication->dobj.name);
+ if (cmpval != 0)
+ return cmpval;
+ }

Isn't this identical to the previous code block? So can't you just add
DO_PUBLICATION_EXCEPT_REL to that condition?

======
src/bin/pg_dump/t/002_pg_dump.pl

19.
Missing test cases for ALTER? But also.

~~~

20.
Missing test cases for EXCEPT for INHERITED tables?

======
src/bin/psql/describe.c

describeOneTableDetails:

21.
I was wondering if the "describe" for tables (e.g. \d+) should also
show the publications where the table is an ECEPT TABLE? How else is
the user going to know it has been excluded by some publication?

======
src/bin/psql/tab-complete.in.c

ALTER PUBLICATION:

22.
The tab completion does not seem as good as it could be. e.g, there is
missing '(' and the for EXCEPT TABLE

~~~

CREATE PUBLICATION:

23.
The tab completion does not seem as good as it could be. e.g, there is
missing '(' and the for EXCEPT TABLE

======
src/test/regress/sql/publication.sql

24.
+\dRp+ testpub_foralltables_excepttable
+\dRp+ testpub_foralltables_excepttable1

As well as doing the "describes" for the publication, I think we need
to see the test cases for the describes of those excluded tables. e.g.
I imagine that they should also list the publications that they are
*excluded* from, right?

~~~

25.
+CREATE PUBLICATION testpub5 FOR ALL TABLES EXCEPT TABLE (testpub_tbl3);
+CREATE PUBLICATION testpub6 FOR ALL TABLES EXCEPT TABLE (ONLY testpub_tbl3);

25a.
Needs some explanatory comments here saying these are for testing the
EXCEPT with inherited tables (e.g. ONLY versus not).

~

25b.
I think you should be testing the '*' syntax here too.

~~~

26.
+CREATE TABLE pub_sch1.tbl2 (a int);
SET client_min_messages = 'ERROR';
CREATE PUBLICATION testpub_reset FOR ALL TABLES, ALL SEQUENCES;
RESET client_min_messages;
@@ -1344,9 +1358,15 @@ ALTER PUBLICATION testpub_reset ADD ALL TABLES;

-- Can't add ALL TABLES to 'ALL TABLES' publication
ALTER PUBLICATION testpub_reset ADD ALL TABLES;
+ALTER PUBLICATION testpub_reset RESET;
+
+-- Verify adding EXCEPT TABLE
+ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE
(pub_sch1.tbl1, pub_sch1.tbl2);
+\dRp+ testpub_reset

DROP PUBLICATION testpub_reset;
DROP TABLE pub_sch1.tbl1;
+DROP TABLE pub_sch1.tbl2;
DROP SCHEMA pub_sch1;

It looks like that added CREATE TABLE (and RESET?) belongs more
appropriately within the scope of the new test "Verify adding EXCEPT
TABLE".

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nazir Bilal Yavuz 2025-11-24 07:48:24 Re: Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache
Previous Message Sergei Glukhov 2025-11-24 07:32:18 Partial hash index is not used for implied qual.