Quick Links

Re: Skipping schema changes in publication

From:	Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
To:	Peter Smith <smithpb2250(at)gmail(dot)com>
Cc:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, YeXiu <1518981153(at)qq(dot)com>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Skipping schema changes in publication
Date:	2025-08-13 09:21:42
Message-ID:	CANhcyEW+uJB_bvQLEaZCgoRTc1=i+QnrPPHxZ2=0SBSCyj9pkg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Mon, 11 Aug 2025 at 13:55, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi Shlok.
>
> On Wed, Aug 6, 2025 at 11:11 PM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
> >
> ...
> > > 5.
> > > Bitmapset *cols = NULL;
> > > + bool except_columns = false;
> > > + bool no_col_published = false;
> > >
> > > There are multiple places in this patch that say:
> > >
> > > 'no_col_published'
> > > or 'no_cols_published'
> > >
> > > I felt this var name can be misunderstood because it is easy to read
> > > "no" as meaning "no." (aka number), and then misinterpret as
> > > "number_of_cols_published".
> > >
> > > Maybe an unambiguous name can be found, like
> > > - 'zero_cols_published' or
> > > - 'nothing_published' or
> > > - really make it 'num_cols_published' and check for 0.
> > >
> > > (so this comment applies to multiple places in the patch)
> > >
> > How about 'all_cols_excluded'? Or 'has_published_cols'?
> > I have used 'all_cols_excluded' in this patch. Thoughts?
>
> The new name is good.
>
> > > ======
> > > src/bin/psql/describe.c
> > >
> > > describeOneTableDetails:
> > >
> > > 7.
> > > /* column list (if any) */
> > > if (!PQgetisnull(result, i, 2))
> > > - appendPQExpBuffer(&buf, " (%s)",
> > > - PQgetvalue(result, i, 2));
> > > + {
> > > + if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
> > > + appendPQExpBuffer(&buf, " EXCEPT (%s)",
> > > + PQgetvalue(result, i, 2));
> > > + else
> > > + appendPQExpBuffer(&buf, " (%s)",
> > > + PQgetvalue(result, i, 2));
> > > + }
> > >
> > > Isn't this code fragment (and also surrounding code) using the same
> > > logic as what is already encapsulated in the function
> > > addFooterToPublicationDesc()?
> > > Superficially, it seems like a large chunk can all be replaced with a
> > > single call to the existing function.
> > >
> > 'addFooterToPublicationDesc' is called when we use \dRp+ and print in format:
> > "schema_name.table_name" EXCEPT (column-list)
> > Whereas code pasted above is executed when we use \d+ table_name and
> > the output is the format:
> > "publication_name" EXCEPT (column-list)
> >
> > These pieces of code are used to print different info. One is used to
> > print info related to tables and the other is used to print info
> > related to publication.
> > Should we use a common function for this?
>
> It still seems like quite a lot of overlap. e.g. I thought there were
> ~30 lines common. OTOH, perhaps you'll need to pass another boolean to
> the function to indicate it is a "Publication:" footer. I guess you'd
> have to try it out first to see if the changes required to save those
> 30 LOC are worthwhile or not.
>
I have added the code changes for the same in this patch.

> >
> > > ======
> > > src/test/regress/expected/publication.out
> > >
> > > 8.
> > > +-- Syntax error EXCEPT without a col-list
> > > +CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT;
> > > +ERROR: EXCEPT clause not allowed for table without column list
> > > +LINE 1: CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except...
> > > + ^
> > >
> > > Is that a bad syntax position marker (^)? e.g. Why is it pointed at
> > > the word "TABLE" instead of "EXCEPT"?
> > >
> > In function 'preprocess_pubobj_list' the position of position marker
> > (^) is decided by "pubobj->location". Function handles multiple errors
> > and setting "$$->location" only specific to EXCEPT qualifier would not
> > be appropriate. One solution I feel is to not show "position marker
> > (^)" in the case of EXCEPT. Or maybe we can add a new variable to
> > 'PublicationTable' for except_location but I think we should not do
> > that. Thoughts?
>
> In the review comments below, I suggest putting this location back,
> but changing the message.
>
> >
> > For this version of patch, I have removed the "position marker (^)" in
> > the case of EXCEPT.
> >
>
> //////
>
> Here are my review comments for the patch v19-0003.
>
> ======
> 1. General - SGML tags in docs for table/column names.
>
> There is nothing to change just yet, but keep an eye on the thread
> [1], because if/when that gets pushed, then there will several tags
> in this patch for table/column names that will need to be updated for
> consistency.
>
Noted

> ======
> src/backend/catalog/pg_publication.c
>
> pg_get_publication_tables:
>
> 2.
> +
> + if (!nulls[2])
> + {
> + Datum exceptDatum;
> + bool isnull;
> +
> + /*
> + * We fetch pubtuple if publication is not FOR ALL TABLES and
> + * not FOR TABLES IN SCHEMA. So if prexcept is true, it
> + * indicates that prattrs contains columns to be excluded for
> + * replication.
> + */
> + exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple,
> + Anum_pg_publication_rel_prexcept,
> + &isnull);
> +
> + if (!isnull && DatumGetBool(exceptDatum))
> + except_columns = pub_collist_to_bitmapset(NULL, values[2], NULL);
> + }
>
> Maybe this should be done a few lines earlier, to keep all the
> values[2]/nulls[2] code together, ahead of the values[3]/nulls[3]
> code. Indeed, there is lots of other values[2]/nulls[2] logic that
> comes later in this function, so maybe it is better to do all of that
> first, instead of mingling it with values[3]/nulls[3].
>
> ======
> src/backend/commands/publicationcmds.c
>
> pub_contains_invalid_column:
>
> 3.
> * 1. Ensures that all columns referenced in the REPLICA IDENTITY are covered
> - * by the column list. If any column is missing, *invalid_column_list is set
> + * by the column list and are not part of column list specified with EXCEPT.
> + * If any column is missing, *invalid_column_list is set
> * to true.
>
> Whitespace problem here; there is some tab instead of space in this comment.
>
> Also /part of column list/part of the column list/
>
> ~~~
>
> AlterPublicationTables:
>
> 4.
> bool isnull = true;
> Datum whereClauseDatum;
> Datum columnListDatum;
> + Datum exceptDatum;
>
> It's not necessary to have all these different Datum variables; they
> are only temporary storage. It might be simpler to use a single "Datum
> datum;" which is reused 3x.
>
> ~
>
> 5.
> + exceptDatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> + Anum_pg_publication_rel_prexcept,
> + &isnull);
> +
> + if (!isnull)
> + oldexcept = DatumGetBool(exceptDatum);
> +
>
> Isn't the 'prexcept' also used for EXCEPT TABLE as well as EXCEPT
> (column-list)? In other words, should the change to this function be
> done already in one of the earlier patches?
>
> ~
This code path is only executed when running ALTER PUBLICATION ... SET
TABLE and running this command on a ALL TABLES publication throws an
error due to check by function 'CheckAlterPublication' . And EXCEPT
TABLE can only be used for ALL TABLES publications, I think it doesn’t
need to be moved to the 0002 patch.

>
> 6.
> if (equal(oldrelwhereclause, newpubrel->whereClause) &&
> - bms_equal(oldcolumns, newcolumns))
> + bms_equal(oldcolumns, newcolumns) &&
> + oldexcept == newpubrel->except)
>
> The code comment about this code fragment should also mention EXCEPT.
>
> ======
> src/backend/parser/gram.y
>
> preprocess_pubobj_list:
>
> 7.
> + if (pubobj->pubtable && pubobj->pubtable->except &&
> + pubobj->pubtable->columns == NULL)
> + ereport(ERROR,
> + errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("EXCEPT clause not allowed for table without column list"));
> +
>
> Having the syntax error location (like before in v18) might be better,
> but since that location is associated with the TABLE, then the error
> message should also be reworded so the subject is the table.
>
> SUGGESTION
> errmsg("table without column list cannot use EXCEPT clause")
>
> ======
> src/bin/psql/describe.c
>
> describeOneTableDetails:
>
> 8.
> - if (pset.sversion >= 150000)
> + if (pset.sversion >= 190000)
> {
> printfPQExpBuffer(&buf,
> "SELECT pubname\n"
> " , NULL\n"
> " , NULL\n"
> + " , NULL\n"
> "FROM pg_catalog.pg_publication p\n"
> " JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
> " JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
> @@ -3038,35 +3039,62 @@ describeOneTableDetails(const char *schemaname,
> " pg_catalog.pg_attribute\n"
> " WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
> " ELSE NULL END) "
> + " , prexcept "
> "FROM pg_catalog.pg_publication p\n"
> " JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
> " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
> - "WHERE pr.prrelid = '%s'\n",
> - oid, oid, oid);
> -
> - if (pset.sversion >= 190000)
> - appendPQExpBufferStr(&buf, " AND NOT pr.prexcept\n");
> + "WHERE pr.prrelid = '%s' "
> + "AND c.relnamespace NOT IN (\n "
> + " SELECT pnnspid FROM\n"
> + " pg_catalog.pg_publication_namespace)\n"
>
> - appendPQExpBuffer(&buf,
> "UNION\n"
> "SELECT pubname\n"
> " , NULL\n"
> " , NULL\n"
> + " , NULL\n"
> "FROM pg_catalog.pg_publication p\n"
> - "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n",
> - oid);
> -
> - if (pset.sversion >= 190000)
> - appendPQExpBuffer(&buf,
> - " AND NOT EXISTS (\n"
> - " SELECT 1\n"
> - " FROM pg_catalog.pg_publication_rel pr\n"
> - " JOIN pg_catalog.pg_class pc\n"
> - " ON pr.prrelid = pc.oid\n"
> - " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n",
> - oid);
> -
> - appendPQExpBufferStr(&buf, "ORDER BY 1;");
> + "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> + " AND NOT EXISTS (\n"
> + " SELECT 1\n"
> + " FROM pg_catalog.pg_publication_rel pr\n"
> + " JOIN pg_catalog.pg_class pc\n"
> + " ON pr.prrelid = pc.oid\n"
> + " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n"
> + "ORDER BY 1;",
> + oid, oid, oid, oid, oid);
> + }
> + else if (pset.sversion >= 150000)
> + {
> + printfPQExpBuffer(&buf,
> + "SELECT pubname\n"
> + " , NULL\n"
> + " , NULL\n"
> + "FROM pg_catalog.pg_publication p\n"
> + " JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid\n"
> + " JOIN pg_catalog.pg_class pc ON pc.relnamespace = pn.pnnspid\n"
> + "WHERE pc.oid ='%s' and pg_catalog.pg_relation_is_publishable('%s')\n"
> + "UNION\n"
> + "SELECT pubname\n"
> + " , pg_get_expr(pr.prqual, c.oid)\n"
> + " , (CASE WHEN pr.prattrs IS NOT NULL THEN\n"
> + " (SELECT string_agg(attname, ', ')\n"
> + " FROM pg_catalog.generate_series(0,
> pg_catalog.array_upper(pr.prattrs::pg_catalog.int2[], 1)) s,\n"
> + " pg_catalog.pg_attribute\n"
> + " WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n"
> + " ELSE NULL END) "
> + "FROM pg_catalog.pg_publication p\n"
> + " JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
> + " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n"
> + "WHERE pr.prrelid = '%s'\n"
> + "UNION\n"
> + "SELECT pubname\n"
> + " , NULL\n"
> + " , NULL\n"
> + "FROM pg_catalog.pg_publication p\n"
> + "WHERE p.puballtables AND pg_catalog.pg_relation_is_publishable('%s')\n"
> + "ORDER BY 1;",
> + oid, oid, oid, oid);
>
> I found these large SQL selects with 3x UNIONs are difficult to read.
> Maybe you can add more comments to describe the intention of each of
> the UNION SELECTs?
>
> ~~~
>
> 9.
> /* column list (if any) */
> if (!PQgetisnull(result, i, 2))
> - appendPQExpBuffer(&buf, " (%s)",
> - PQgetvalue(result, i, 2));
> + {
> + if (strcmp(PQgetvalue(result, i, 3), "t") == 0)
> + appendPQExpBuffer(&buf, " EXCEPT");
> + appendPQExpBuffer(&buf, " (%s)", PQgetvalue(result, i, 2));
> + }
>
> I did not find any regression test case where the "EXCEPT" col-list is
> getting output for a "Publications:" footer.
>
> ======
> [1] https://www.postgresql.org/message-id/aIELRMAviNiUL1ie%40momjian.us
>

I have addressed the comments and the changes in v20 patch.

Thanks,
Shlok Kyal

Attachment	Content-Type	Size
v20-0001-Add-RESET-clause-to-Alter-Publication-which-will.patch	application/octet-stream	20.4 KB
v20-0003-Skip-publishing-the-columns-specified-in-FOR-TAB.patch	application/octet-stream	75.0 KB
v20-0002-Skip-publishing-the-tables-specified-in-EXCEPT-T.patch	application/octet-stream	79.0 KB

In response to

Re: Skipping schema changes in publication at 2025-08-11 08:25:12 from Peter Smith

Responses

Re: Skipping schema changes in publication at 2025-08-15 00:53:10 from Peter Smith

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andrey Borodin	2025-08-13 09:52:04	Re: Fix for typo in UUIDv7 timestamp extraction
Previous Message	Chao Li	2025-08-13 09:16:01	Re: Improve hash join's handling of tuples with null join keys