From: | Peter Smith <smithpb2250(at)gmail(dot)com> |
---|---|
To: | Shubham Khanna <khannashubham1197(at)gmail(dot)com> |
Cc: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Add support for specifying tables in pg_createsubscriber. |
Date: | 2025-09-11 02:31:29 |
Message-ID: | CAHut+PuHJNid6iV7+pPXH-Vb8ygSdb6WLxPHweQGrLh5rH7nYA@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Shubham,
My first impression of this latest patch is that the new option name
is not particularly intuitive.
Below are some more review comments for patch v5-0001; any changes to
the name will propagate all through this with different member names
and usage, etc, so I did not repeat that same comment over and over.
======
Commit message
1.
Add a new '--if-not-exists' option to pg_createsubscriber, allowing users to
reuse existing publications if they are already present, or create them if they
do not exist.
~
IMO, this option name ("if-not-exists") gives the user no clues as to
its purpose.
I suppose you wanted the user to associate this option with a CREATE
PUBLICATION, so that they can deduce that it acts internally like a
"CREATE PUBLICATION pub ... IF NOT EXISTS" (if there was such a
thing), I don't see how a user is able to find any meaning at all from
this vague name. The only way they can know what this means is to read
all the documentation.
e.g.
- if WHAT doesn't exist?
- if <??> not exists, then do WHAT?
Alternatives like "--reuse-pub-if-exists" or "--reuse-existing-pubs"
might be easier to understand.
~~~
2.
Key features:
1. New '--if-not-exists' flag changes the behavior of '--publication'.
2. If the publication exists, it is reused.
3. If it does not exist, it is created automatically.
4. Supports per-database specification, consistent with other options.
5. Avoids the complexity of option conflicts and count-matching rules.
6. Provides semantics consistent with SQL’s IF NOT EXISTS syntax.
~
Some of those "features" seem wrong, and some seem inappropriate for
the commit message:
e.g. TBH, I doubt that "4. Supports per-database specification" can
work as claimed here. Supposing I have multiple databases, then I
cannot see how I can say "if-not-exists" for some databases but not
for others.
e.g. The "5. Avoids the complexity..." seems like a reason why an
alternative design was rejected; Why does this comment belong here?
e.g. The "6. Provides semantics..." seems like your internal
implementation justification, whereas IMO this patch should be more
focused on making any new command-line option more
intuitive/meaningful from the point of view of the user.
======
doc/src/sgml/ref/pg_createsubscriber.sgml
3.
+ <para>
+ This option provides flexibility for mixed scenarios where some
+ publications may already exist while others need to be created.
+ It eliminates the need to know in advance which publications exist on
+ the publisher.
+ </para>
Hmm. Is that statement ("It eliminates the need to know...") correct?
I feel it should say the total opposite of this.
For example, IMO now the user needs to be extra careful because if
they say --publication=pub1 --if-not-exists then they have to be 100%
sure if 'pub1' exists or does not exist, otherwise they might
accidentally be making pub1 FOR ALL TABLES when really they expected
they were reusing some existing publication 'pub1', or vice versa.
In fact, I think the docs should go further and *recommend* that the
user never uses this new option without firstly doing a --dry-run to
verify they are actually getting the publications that they assume
they are getting.
~~~
4.
+ <para>
+ This option follows the same semantics as SQL
+ <literal>IF NOT EXISTS</literal> clauses, providing consistent behavior
+ with other PostgreSQL tools.
+ </para>
Why even say this in the docs? What other "tools" are you referring to?
======
src/bin/pg_basebackup/pg_createsubscriber.c
5.
bool made_publication; /* publication was created */
+ bool publication_existed; /* publication existed before we
+ * started */
};
This new member feels redundant to me.
AFAIK, the publication will either exist already, OR it will be
created/made by pg_createsubscriber. You don't need 2 booleans to
represent that.
At most, maybe all you want is another local variable in the function
setup_publisher().
~~~
cleanup_objects_atexit:
(same comments as posted in my v4 review)
6a.
- if (dbinfo->made_publication)
+ if (dbinfo->made_publication && !dbinfo->publication_existed)
Same as above review comment #5. It's either made or it's not made,
right? Why the extra boolean?
~
6b.
- if (dbinfo->made_publication)
+ if (dbinfo->made_publication && !dbinfo->publication_existed)
Ditto.
~~~
check_publication_exists:
7.
+/*
+ * Add function to check if publication exists.
+ */
Function comment should not say "Add function".
~
8.
+ PGresult *res;
+ bool exists = false;
+ char *query;
Redundant assignment?
~~~
setup_publisher:
9.
+ /*
+ * Check if publication already exists when --if-not-exists is
+ * specified
+ */
+ if (opt->if_not_exists)
+ {
+ dbinfo[i].publication_existed = check_publication_exists(conn,
dbinfo[i].pubname, dbinfo[i].dbname);
+ if (dbinfo[i].publication_existed)
+ {
+ pg_log_info("using existing publication \"%s\" in database \"%s\"",
+ dbinfo[i].pubname, dbinfo[i].dbname);
+ }
+ }
+
/*
* Create publication on publisher. This step should be executed
* *before* promoting the subscriber to avoid any transactions between
* consistent LSN and the new publication rows (such transactions
* wouldn't see the new publication rows resulting in an error).
*/
- create_publication(conn, &dbinfo[i]);
+ if (opt->if_not_exists && dbinfo[i].publication_existed)
+ dbinfo[i].made_publication = false;
+ else
+ {
+ create_publication(conn, &dbinfo[i]);
+ if (opt->if_not_exists)
+ {
+ pg_log_info("created publication \"%s\" in database \"%s\"",
+ dbinfo[i].pubname, dbinfo[i].dbname);
+ }
+ }
This all seems way more complicated than it needs to be. e.g. I doubt
that you need to be checking opt->if_not_exists 3 times.
Simpler logic might be more like below:
bool make_pub = true;
if (opt->if_not_exists && check_publication_exists(...))
{
pg_log_info("using existing...");
make_pub = false;
}
...
if (make_pub)
{
create_publication(conn, &dbinfo[i]);
pg_log_info("created publication...");
}
dbinfo[i].made_publication = make_pub;
======
.../t/040_pg_createsubscriber.pl
10.
+# Initialize node_s2 as a fresh standby of node_p for table-level
+# publication test.
I don't know that you should still be calling this a "table-level"
test. Maybe it's more like an "existing publication" test now?
~~~
11.
+is( $result, qq({test_pub_existing}
+{test_pub_new}),
+ "subscriptions use the correct publications with --if-not-exists in
$db1 and $db2"
+);
Something seems broken. I deliberately caused an error to occur to see
what would happen, and the substitutions of $db1 and $db2 went crazy.
# Failed test 'subscriptions use the correct publications with
--if-not-exists in regression\"\
¬ !"\#$%&'()*+,-\\"\\\
and regression./0123456789:;<=>?(at)ABCDEFGHIJKLMNOPQRSTUVWXYZ'
# at t/040_pg_createsubscriber.pl line 614.
======
Kind Regards,
Peter Smith
Fujitsu Australia
From | Date | Subject | |
---|---|---|---|
Next Message | jian he | 2025-09-11 02:49:39 | Re: someone else to do the list of acknowledgments |
Previous Message | Richard Guo | 2025-09-11 02:16:29 | Re: expand virtual generated columns in get_relation_constraints() |