| From: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> |
|---|---|
| To: | Peter Smith <smithpb2250(at)gmail(dot)com> |
| Cc: | vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, YeXiu <1518981153(at)qq(dot)com>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Skipping schema changes in publication |
| Date: | 2025-12-09 17:48:48 |
| Message-ID: | CANhcyEWt0h-tqDZzQOhHXiS+fo2Wn9NomtvXq8oy=JG3nVTM2g@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Mon, 24 Nov 2025 at 13:03, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Fri, Nov 21, 2025 at 5:55 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Hi Shlok.
> >
> > Here are some review comments for your patch v28-0003 (EXCEPT TABLE ...).
> >
> > The review of this patch is a WIP. In this post I only looked at the test code.
> >
>
> Here are my remaining review comments for patch v28-0003 (EXCEPT TABLE ...).
>
> ======
> doc/src/sgml/ref/create_publication.sgml
>
> 1.
> -ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> ADD ALL TABLES
> +ALTER PUBLICATION <replaceable class="parameter">name</replaceable>
> ADD ALL TABLES [ EXCEPT [ TABLE ] ( <replaceable
> class="parameter">table_exception_object</replaceable> [, ... ] ) ]
>
> Why is that optional [TABLE] keyword needed?
>
> I know PostGres commands sometimes have "noise" words in the syntax so
> the command can be more English-like, but in this case, the
> publication is a FOR ALL *TABLES* anyway, so I am not sure what the
> benefit is of the user being able to say TABLE a 2nd time?
>
I think this feature can be extended to EXCEPT SCHEMA etc. So I think
it is necessary for clarity.
There is already a discussion [1].
> ======
> src/backend/catalog/pg_publication.c
>
> 2.
> + /*
> + * Check for partitions of partitioned table which are specified with
> + * EXCEPT clause and partitioned table is published with
> + * publish_via_partition_root = true.
> + */
>
> I think you can just say "partitions" or "table partitions", but
> "partitions of [a] partitioned table" seems overkill.
>
> Also, "... and partitioned table is published with
> publish_via_partition_root = true." seems too wordy. Isn't that just
> the same as "... and publish_via_partition_root = true"
>
> SUGGESTION
> Check for when the publication says "EXCEPT TABLE (partition)" but
> publish_via_partition_root = true.
>
> ~~~
>
Modified
> 3.
> -/* Gets list of publication oids for a relation */
> +/* Gets list of publication oids for a relation that matches the except_flag */
> List *
> -GetRelationPublications(Oid relid)
> +GetRelationPublications(Oid relid, bool except_flag)
> {
> List *result = NIL;
> CatCList *pubrellist;
> @@ -765,7 +791,8 @@ GetRelationPublications(Oid relid)
> HeapTuple tup = &pubrellist->members[i]->tuple;
> Oid pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid;
>
> - result = lappend_oid(result, pubid);
> + if (except_flag == ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept)
> + result = lappend_oid(result, pubid);
> }
>
> I was wondering if it might be better to return 2 lists from this
> function (e.g. an included-list, and an excluded-list) instead of
> passing the 'except_flag' like the current code. IIUC, you are mostly
> calling this function twice to get 2 lists anyway, but returning 2
> lists instead of 1, this function might be more efficient since it
> will only process the publication loop once.
>
Modified
> ~~~
>
> 4.
> /*
> * Gets list of relation oids for a publication that matches the except_flag.
> *
> * This should only be used FOR TABLE publications, the FOR ALL TABLES/SEQUENCES
> * should use GetAllPublicationRelations().
> */
> List *
> GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt,
> bool except_flag)
> Something doesn't seem right -- the function comment says we shouldn't
> be calling the function for FOR ALL TABLES, but meanwhile, EXCEPT
> TABLE is currently only implemented via FOR ALL TABLES. So it feels
> contradictory. Maybe it is just the comment that needs updating?
>
I thought more about this function and found that we can remove the
'except_flag' variable.
Since we can only use EXCEPT TABLE clause for ALL TABLES publication
and we cannot use FOR TABLE clause with ALL TABLES.
If for ALL TABLES publication we call this function, we will return an
except table list.
Else we will return a list of table to be included in publication.
I have added a comment to this behaviour.
> ~~~
>
> 5.
> /*
> * Gets list of relation oids for a publication that matches the except_flag.
> *
> * This should only be used FOR TABLE publications, the FOR ALL TABLES/SEQUENCES
> * should use GetAllPublicationRelations().
> */
> List *
> GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt,
> bool except_flag)
> {
> List *result;
> Relation pubrelsrel;
> ScanKeyData scankey;
> SysScanDesc scan;
> HeapTuple tup;
>
> /* Find all publications associated with the relation. */
> pubrelsrel = table_open(PublicationRelRelationId, AccessShareLock);
>
> Existing bug? Isn't this a bogus comment?
> /* Find all publications associated with the relation. */
>
> Was that meant to be the other way around? -- e.g. Find all the
> relations associated with the specified publication.
>
I think you are correct. I will create a separate thread for this change.
> ======
> src/backend/commands/publicationcmds.c
>
> 6.
> + default:
> + /* shouldn't happen */
> + elog(ERROR, "invalid publication object type %d",
> + puballobj->pubobjtype);
> + break;
>
> I think the ERROR is enough of a clue that it shouldn't happen. I felt
> the comment was redundant.
>
There are multiple similar occurrences. See functions
'ObjectsInPublicationToOids', in publicationcmds.c.
There are some occurrences in the openssl.c file as well. But I also
think this comment is redundant.
I have removed the comment.
> ~~~
>
> ObjectsInPublicationToOids:
>
> 7.
> case PUBLICATIONOBJ_TABLE:
> + pubobj->pubtable->except = false;
> + *rels = lappend(*rels, pubobj->pubtable);
> + break;
> + case PUBLICATIONOBJ_EXCEPT_TABLE:
> + pubobj->pubtable->except = true;
> *rels = lappend(*rels, pubobj->pubtable);
> break;
> Those are very similar. How about combining like below?
>
> case PUBLICATIONOBJ_TABLE:
> case PUBLICATIONOBJ_EXCEPT_TABLE:
> pubobj->pubtable->except = (pubobj->pubobjtype ==
> PUBLICATIONOBJ_EXCEPT_TABLE);
> *rels = lappend(*rels, pubobj->pubtable);
> break;
>
Modified
> ~~
>
> pub_contains_invalid_column:
>
> 8.
> pub_contains_invalid_column(Oid pubid, Relation relation, List *ancestors,
> bool pubviaroot, char pubgencols_type,
> - bool *invalid_column_list,
> + bool puballtables, bool *invalid_column_list,
> bool *invalid_gen_col)
>
> The 'pub_via_root' and 'pubgencols_type' are parameters. Somehow it
> seems more natural for the 'puballtables' to be passed before those,
> because FOR ALL TABLES comes before WITH in the syntax.
>
Modified
> ~~~
>
> CreatePublication:
>
> 9.
> else if (!stmt->for_all_sequences)
> - {
> ObjectsInPublicationToOids(stmt->pubobjects, pstate, &relations,
> &schemaidlist);
>
> AFAICT, this function is refactored a lot because of the removal of
> that '{'. It looks like mostly whitespace, but really, I think the
> logic is quite different. I wasn't sure what that was about. Is it
> related to this patch, or some other bugfix in passing or what?
>
This change is part of this patch.
This change is required to get a list of tables which are excluded
(for ALL TABLES publication).
I think the code related to schema should be inside the condition
'else if (!stmt->for_all_sequences)'
I have made the change for the same and also added a comment.
> ======
> src/backend/commands/tablecmds.c
>
> ATPrepChangePersistence:
>
> 10.
> - GetRelationPublications(RelationGetRelid(rel)) != NIL)
> + list_length(GetRelationPublications(RelationGetRelid(rel), false)) > 0)
>
> Isn't an empty List the same as a NIL list? Maybe that list_length()
> change was not really needed.
>
Modified
> ======
> src/backend/parser/gram.y
>
> 11.
> drop_option_list pub_obj_list pub_all_obj_type_list
> + except_pub_obj_list opt_except_clause
>
> Is this name consistent with the others? Should it be pub_except_obj_list?
>
I think pub_except_obj_list is consistent with others. Modified.
> ~~~
>
> 12.
> %type <publicationobjectspec> PublicationObjSpec
> +%type <publicationobjectspec> ExceptPublicationObjSpec
> %type <publicationallobjectspec> PublicationAllObjSpec
>
> Is this name consistent with the others? Should it be PublicationExceptObjSpec?
>
I agree. Modified.
> ~~~
>
> CreatePublicationStmt:
>
> 13.
> n->pubname = $3;
> + n->pubobjects = $5;
>
> I noticed that sometimes there is a cast (List *) and other times
> there is not. e.g. none here, but cast in AlterPublicationStmt. Why
> the differences?
>
This change is not required in the latest patch. Due to discussion in [2].
> ~~~
>
> PublicationObjSpec:
>
> 14.
> The comment for 'PublicationObjSpec' says "FOR TABLE and FOR TABLES IN
> SCHEMA specifications". If that comment is correct, then why is this
> patch changing this code? OTOH, if the code is correct, then does the
> comment need updating?
>
We have only added "$$->location = @1;" for PublicationObjSpec.
I define the location of the '^' indicator while throwing an error. I
think we don't need to update comments for it?
> ======
> src/bin/pg_dump/pg_dump.c
>
> 15.
> Shouldn't there have already been some ALTER ... ADD ALL TABLE dump
> code and test code implemented back in patch 0002?
>
This change is not required in the latest patch. Due to discussion in [2].
> ~~~
>
> dumpPublication:
>
> 16.
> else if (pubinfo->puballtables)
> + {
> + SimplePtrListCell *cell;
> +
> appendPQExpBufferStr(query, " FOR ALL TABLES");
> +
> + /* Include exception tables if the publication has except tables */
> + for (cell = exceptinfo.head; cell; cell = cell->next)
> + {
> + PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr;
> + TableInfo *tbinfo;
> +
> + if (pubinfo == pubrinfo->publication)
> + {
> + tbinfo = pubrinfo->pubtable;
> +
> + if (first)
> + {
> + appendPQExpBufferStr(query, " EXCEPT TABLE (");
> + first = false;
> + }
> + else
> + appendPQExpBufferStr(query, ", ");
> + appendPQExpBuffer(query, "ONLY %s", fmtQualifiedDumpable(tbinfo));
> + }
> + }
> + if (!first)
> + appendPQExpBufferStr(query, ")");
> + }
>
> 16a.
> SimplePtrListCell *cell can be declared as a for-loop variable.
>
> ~
>
> 16b.
> The comment should say "EXCEPT TABLES" in uppercase.
>
> ~
>
> 16c.
> I am not convinced you can use that 'first' flag like you are doing.
> Isn't that interfering with the existing usage of that flag? Perhaps
> another boolean just for this EXCEPT loop is needed.
>
> ~~~
>
Modified
> getPublicationTables:
>
> 17.
> + if (strcmp(prexcept, "f") == 0)
> + pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;
> + else
> + pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL;
> +
>
> ...
>
> + if (strcmp(prexcept, "t") == 0)
> + simple_ptr_list_append(&exceptinfo, &pubrinfo[j]);
> +
>
> Here you are comparing the same 'prexcept' flag for both "f" and "t".
>
> I felt it was better if both comparisons are the same (e.g. both "t").
>
> Or better still, assign a new boolean and avoid that 2nd strcmp
> entirely -- e.g. except_flag = (strcmp(prexcept, "t") == 0);
>
Modified
> ======
> src/bin/pg_dump/pg_dump_sort.c
>
> DOTypeNameCompare:
>
> 18.
> + else if (obj1->objType == DO_PUBLICATION_EXCEPT_REL)
> + {
> + PublicationRelInfo *probj1 = *(PublicationRelInfo *const *) p1;
> + PublicationRelInfo *probj2 = *(PublicationRelInfo *const *) p2;
> +
> + /* Sort by publication name, since (namespace, name) match the rel */
> + cmpval = strcmp(probj1->publication->dobj.name,
> + probj2->publication->dobj.name);
> + if (cmpval != 0)
> + return cmpval;
> + }
>
> Isn't this identical to the previous code block? So can't you just add
> DO_PUBLICATION_EXCEPT_REL to that condition?
>
Modified
> ======
> src/bin/pg_dump/t/002_pg_dump.pl
>
> 19.
> Missing test cases for ALTER? But also.
>
This change is not required in the latest patch. Due to discussion in [2].
> ~~~
>
> 20.
> Missing test cases for EXCEPT for INHERITED tables?
>
> ======
> src/bin/psql/describe.c
>
> describeOneTableDetails:
>
> 21.
> I was wondering if the "describe" for tables (e.g. \d+) should also
> show the publications where the table is an ECEPT TABLE? How else is
> the user going to know it has been excluded by some publication?
>
I thought it would be sufficient to show only the list of
publications, the table is part of.
Users can check the excluded tables by checking the description of the
publication using \dRp+.
Will it be not sufficient?
I am not sure why we should show a list of publications which it is not part of?
Am I missing something thoughts?
> ======
> src/bin/psql/tab-complete.in.c
>
> ALTER PUBLICATION:
>
> 22.
> The tab completion does not seem as good as it could be. e.g, there is
> missing '(' and the for EXCEPT TABLE
>
> ~~~
>
Modified
> CREATE PUBLICATION:
>
> 23.
> The tab completion does not seem as good as it could be. e.g, there is
> missing '(' and the for EXCEPT TABLE
>
Modified
> ======
> src/test/regress/sql/publication.sql
>
> 24.
> +\dRp+ testpub_foralltables_excepttable
> +\dRp+ testpub_foralltables_excepttable1
>
> As well as doing the "describes" for the publication, I think we need
> to see the test cases for the describes of those excluded tables. e.g.
> I imagine that they should also list the publications that they are
> *excluded* from, right?
>
See Reply to comment 21.
> ~~~
>
> 25.
> +CREATE PUBLICATION testpub5 FOR ALL TABLES EXCEPT TABLE (testpub_tbl3);
> +CREATE PUBLICATION testpub6 FOR ALL TABLES EXCEPT TABLE (ONLY testpub_tbl3);
>
> 25a.
> Needs some explanatory comments here saying these are for testing the
> EXCEPT with inherited tables (e.g. ONLY versus not).
>
> ~
>
> 25b.
> I think you should be testing the '*' syntax here too.
>
> ~~~
>
I agree. Made the changes.
> 26.
> +CREATE TABLE pub_sch1.tbl2 (a int);
> SET client_min_messages = 'ERROR';
> CREATE PUBLICATION testpub_reset FOR ALL TABLES, ALL SEQUENCES;
> RESET client_min_messages;
> @@ -1344,9 +1358,15 @@ ALTER PUBLICATION testpub_reset ADD ALL TABLES;
>
> -- Can't add ALL TABLES to 'ALL TABLES' publication
> ALTER PUBLICATION testpub_reset ADD ALL TABLES;
> +ALTER PUBLICATION testpub_reset RESET;
> +
> +-- Verify adding EXCEPT TABLE
> +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE
> (pub_sch1.tbl1, pub_sch1.tbl2);
> +\dRp+ testpub_reset
>
> DROP PUBLICATION testpub_reset;
> DROP TABLE pub_sch1.tbl1;
> +DROP TABLE pub_sch1.tbl2;
> DROP SCHEMA pub_sch1;
>
> It looks like that added CREATE TABLE (and RESET?) belongs more
> appropriately within the scope of the new test "Verify adding EXCEPT
> TABLE".
>
This change is not required in the latest patch. Due to discussion in [2].
I have addressed the comments and attached the latest patch.
As per suggestion by Shveta and Amit, I have omitted the patches 0001,
0002, and 0004 (as per [2]). Will post these patches once 0003 patch
is RFC.
The new 0001 patch is to support EXCEPT TABLE for CREATE PUBLICATION
.. FOR ALL TABLES syntax. I have attached in [3].
[1]: https://www.postgresql.org/message-id/CAA4eK1JEKs8qwwhRb1BCiMNduJ5ePUtFnTscrZt86UKWBkLxwg%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAA4eK1KZ1Sb0soHp3HH2htwJ3%3Dqka-eQjW35vOW3%2B4VeWw4VoQ%40mail.gmail.com
[3]: https://www.postgresql.org/message-id/CANhcyEXwLrQsec6g%2B1dqWTKyJQMQMh%3Dgetj28C%2BzLL14BjuumA%40mail.gmail.com
Thanks,
Shlok Kyal
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2025-12-09 17:49:14 | Re: pg_dump:qemu: uncaught target signal 7 (Bus error) - core dumped |
| Previous Message | Melanie Plageman | 2025-12-09 17:48:45 | Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access) |