Re: Skipping schema changes in publication

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Skipping schema changes in publication
Date: 2022-05-31 06:20:48
Message-ID: CAHut+PtsyfwL3opi2u4gwak7nip66agq7rQynRBL97oXBjKkcg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are my review comments for patch v7-0002.

======

1. doc/src/sgml/logical-replication.sgml

@@ -1167,8 +1167,9 @@ CONTEXT: processing remote data for replication
origin "pg_16395" during "INSER
<para>
To add tables to a publication, the user must have ownership rights on the
table. To add all tables in schema to a publication, the user must be a
- superuser. To create a publication that publishes all tables or
all tables in
- schema automatically, the user must be a superuser.
+ superuser. To add all tables to a publication, the user must be a superuser.
+ To create a publication that publishes all tables or all tables in schema
+ automatically, the user must be a superuser.
</para>

I felt that maybe this whole paragraph should be rearranged. Put the
"create publication" parts before the "alter publication" parts;
Re-word the sentences more similarly. I also felt the ALL TABLES and
ALL TABLES IN SCHEMA etc should be written uppercase/literal since
that is what was meant.

SUGGESTION
To create a publication using FOR ALL TABLES or FOR ALL TABLES IN
SCHEMA, the user must be a superuser. To add ALL TABLES or ALL TABLES
IN SCHEMA to a publication, the user must be a superuser. To add
tables to a publication, the user must have ownership rights on the
table.

~~~

2. doc/src/sgml/ref/alter_publication.sgml

@@ -82,8 +88,8 @@ ALTER PUBLICATION <replaceable
class="parameter">name</replaceable> RESET

<para>
You must own the publication to use <command>ALTER PUBLICATION</command>.
- Adding a table to a publication additionally requires owning that table.
- The <literal>ADD ALL TABLES IN SCHEMA</literal>,
+ Adding a table to or excluding a table from a publication additionally
+ requires owning that table. The <literal>ADD ALL TABLES IN SCHEMA</literal>,
<literal>SET ALL TABLES IN SCHEMA</literal> to a publication and

Isn't this missing some information that says ADD ALL TABLES requires
the invoking user to be a superuser?

~~~

3. doc/src/sgml/ref/alter_publication.sgml - examples

+ <para>
+ Alter publication <structname>production_publication</structname> to publish
+ all tables except <structname>users</structname> and
+ <structname>departments</structname> tables:
+<programlisting>
+ALTER PUBLICATION production_publication ADD ALL TABLES EXCEPT users,
departments;
+</programlisting></para>
+

I didn't think it needs to say "tables" 2x (e.g. remove the last "tables")

~~~

4. doc/src/sgml/ref/create_publication.sgml - examples

+ <para>
+ Create a publication that publishes all changes in all the tables except for
+ the changes of <structname>users</structname> and
+ <structname>departments</structname> tables:
+<programlisting>
+CREATE PUBLICATION mypublication FOR ALL TABLES EXCEPT users, departments;
+</programlisting>
+ </para>

I didn't think it needs to say "tables" 2x (e.g. remove the last "tables")

~~~

5. src/backend/catalog/pg_publication.c

foreach(lc, ancestors)
{
Oid ancestor = lfirst_oid(lc);
- List *apubids = GetRelationPublications(ancestor);
- List *aschemaPubids = NIL;
+ List *apubids = GetRelationPublications(ancestor, false);
+ List *aschemapubids = NIL;
+ List *aexceptpubids = NIL;

level++;

- if (list_member_oid(apubids, puboid))
+ /* check if member of table publications */
+ if (!list_member_oid(apubids, puboid))
{
- topmost_relid = ancestor;
-
- if (ancestor_level)
- *ancestor_level = level;
- }
- else
- {
- aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
- if (list_member_oid(aschemaPubids, puboid))
+ /* check if member of schema publications */
+ aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor));
+ if (!list_member_oid(aschemapubids, puboid))
{
- topmost_relid = ancestor;
-
- if (ancestor_level)
- *ancestor_level = level;
+ /*
+ * If the publication is all tables publication and the table
+ * is not part of exception tables.
+ */
+ if (puballtables)
+ {
+ aexceptpubids = GetRelationPublications(ancestor, true);
+ if (list_member_oid(aexceptpubids, puboid))
+ goto next;
+ }
+ else
+ goto next;
}
}

+ topmost_relid = ancestor;
+
+ if (ancestor_level)
+ *ancestor_level = level;
+
+next:
list_free(apubids);
- list_free(aschemaPubids);
+ list_free(aschemapubids);
+ list_free(aexceptpubids);
}

I felt those negative (!) conditions and those goto are making this
logic hard to understand. Can’t it be simplified more than this? Even
just having another bool flag might help make it easier.

e.g. Perhaps something a bit like this (but add some comments)

foreach(lc, ancestors)
{
Oid ancestor = lfirst_oid(lc);
List *apubids = GetRelationPublications(ancestor);
List *aschemaPubids = NIL;
List *aexceptpubids = NIL;
bool set_top = false;
level++;

set_top = list_member_oid(apubids, puboid);
if (!set_top)
{
aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
set_top = list_member_oid(aschemaPubids, puboid);

if (!set_top && puballtables)
{
aexceptpubids = GetRelationPublications(ancestor, true);
set_top = !list_member_oid(aexceptpubids, puboid);
}
}
if (set_top)
{
topmost_relid = ancestor;

if (ancestor_level)
*ancestor_level = level;
}

list_free(apubids);
list_free(aschemapubids);
list_free(aexceptpubids);
}

------

6. src/backend/commands/publicationcmds.c - CheckPublicationDefValues

+/*
+ * Check if the publication has default values
+ *
+ * Check the following:
+ * a) Publication is not set with "FOR ALL TABLES"
+ * b) Publication is having default options
+ * c) Publication is not associated with schemas
+ * d) Publication is not associated with relations
+ */
+static bool
+CheckPublicationDefValues(HeapTuple tup)

I think Osumi-san already gave a review [1] about this same comment.

So I only wanted to add that it should not say "options" here:
"default options" -> "default publication parameter values"

~~~

7. src/backend/commands/publicationcmds.c - AlterPublicationSetAllTables

+#ifdef USE_ASSERT_CHECKING
+ Assert(!pubform->puballtables);
+#endif

Why is this #ifdef needed? Isn't that logic built into the Assert macro already?

~~~

8. src/backend/commands/publicationcmds.c - AlterPublicationSetAllTables

+ /* set ALL TABLES flag */

Use uppercase 'S' to match other comments.

~~~

9. src/backend/commands/publicationcmds.c - AlterPublication

+ if (!isdefault)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("adding ALL TABLES requires the publication to have default
publication options, no tables/schemas associated and ALL TABLES flag
should not be set"),
+ errhint("Use ALTER PUBLICATION ... RESET to reset the publication"));

IMO this errmsg text is not very good but I think Osumi-san [1] has
also given a review comment about the same errmsg.

So I only wanted to add that should not say "options" here:
"default publication options" -> "default publication parameter values"

~~~

10. src/backend/parser/gram.y

/*****************************************************************************
*
* ALTER PUBLICATION name SET ( options )
*
* ALTER PUBLICATION name ADD pub_obj [, ...]
*
* ALTER PUBLICATION name DROP pub_obj [, ...]
*
* ALTER PUBLICATION name SET pub_obj [, ...]
*
* ALTER PUBLICATION name RESET
*
* pub_obj is one of:
*
* TABLE table_name [, ...]
* ALL TABLES IN SCHEMA schema_name [, ...]
*
*****************************************************************************/

-

Should the above comment be updated to mention also ADD ALL TABLES
... EXCEPT [TABLE] ...

~~~

11. src/bin/pg_dump/pg_dump.c - dumpPublication

+ /* Include exception tables if the publication has except tables */
+ for (cell = exceptinfo.head; cell; cell = cell->next)
+ {
+ PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr;
+ PublicationInfo *relpubinfo = pubrinfo->publication;
+ TableInfo *tbinfo;
+
+ if (pubinfo == relpubinfo)

I am unsure if that variable 'relpubinfo' is of much use; it is only
used one time.

~~~

12. src/bin/pg_dump/t/002_pg_dump.pl

I think there should be more test cases here:

E.g.1. EXCEPT TABLE should also test a list of tables

E.g.2. EXCEPT with optional TABLE keyword ommitted

~~~

13. src/bin/psql/describe.c - question about the SQL

Since the new 'except' is a boolean column, wouldn't it be more
natural if all the SQL was treating it as one?

e.g. should the SQL be saying "IS preexpect", "IS NOT prexcept";
instead of comparing preexpect to 't' and 'f' character.

~~~

14. .../t/032_rep_changes_except_table.pl

+# Test replication with publications created using FOR ALL TABLES EXCEPT TABLE
+# option.
+# Create schemas and tables on publisher

"option" -> "clause"

------
[1] https://www.postgresql.org/message-id/TYCPR01MB83730A2F1D6A5303E9C1416AEDD99%40TYCPR01MB8373.jpnprd01.prod.outlook.com

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Vladimir Sitnikov 2022-05-31 07:16:35 PostgreSQL Limits: maximum number of columns in SELECT result
Previous Message David Rowley 2022-05-31 05:39:51 Re: PG15 beta1 sort performance regression due to Generation context change