From: | Peter Smith <smithpb2250(at)gmail(dot)com> |
---|---|
To: | Shubham Khanna <khannashubham1197(at)gmail(dot)com> |
Cc: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Add support for specifying tables in pg_createsubscriber. |
Date: | 2025-09-17 23:53:22 |
Message-ID: | CAHut+PtAJDqgkeUb7x2m9CApxFbqmLpNVHwbjQH-iecBrXEJXA@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Shubham,
Here are some v9 review comments.
======
doc/src/sgml/ref/pg_createsubscriber.sgml
--clean:
1.
+ <para>
+ When publications are reused, they will not be dropped during cleanup
+ operations, ensuring they remain available for other uses.
+ Only publications created by
+ <application>pg_createsubscriber</application> on the target server
+ will be cleaned up if the operation fails. Publications on the
+ publisher server are never modified or dropped by cleanup operations.
+ </para>
1a.
OK, I understand now that you want the --clean switch to behave the
same as before and remove all publications, except now it will *not*
drop any publications that are being reused.
I can imagine how that might be convenient to keep those publications
around if the subscriber ends up being promoted to a primary node.
OTOH, it seems a bit peculiar that the behaviour of the --clean option
is now depending on the other --publication option.
I wonder what others think about this feature/quirk? e.g. maybe you
need to introduce a --clean=unused_publications?
~
1b.
Anyway, even if this behaviour is what you wanted, that text
describing --clean needs rewording.
Before this paragraph, the docs still say --clean=publications:
"...causes all other publications replicated from the source server to
be dropped as well."
Which is then immediately contradicted when you wrote:
"When publications are reused, they will not be dropped"
~~~
--publication:
2.
+ <para>
+ Use <option>--dry-run</option> to see which publications will be reused
+ and which will be created before running the actual command.
+ </para>
It seemed a bit strange to say "the actual command" because IMO it's
always an actual command regardless of the options.
SUGGEST:
Use --dry-run to safely preview which publications will be reused and
which will be created.
======
src/bin/pg_basebackup/pg_createsubscriber.c
check_and_drop_publications:
3.
The function comment has not been updated with the new rules. It still
says, "Additionally, if requested, drop all pre-existing
publications."
~~~
4.
/* Drop each publication */
for (int i = 0; i < PQntuples(res); i++)
- drop_publication(conn, PQgetvalue(res, i, 0), dbinfo->dbname,
- &dbinfo->made_publication);
-
+ {
+ if (dbinfo->made_publication)
+ drop_publication(conn, PQgetvalue(res, i, 0), dbinfo->dbname,
+ &dbinfo->made_publication);
+ }
PQclear(res);
}
/*
- * In dry-run mode, we don't create publications, but we still try to drop
- * those to provide necessary information to the user.
+ * Only drop publications that were created by pg_createsubscriber during
+ * this operation. Pre-existing publications are preserved.
*/
if (!drop_all_pubs || dry_run)
- drop_publication(conn, dbinfo->pubname, dbinfo->dbname,
- &dbinfo->made_publication);
+ {
+ if (dbinfo->made_publication)
+ drop_publication(conn, dbinfo->pubname, dbinfo->dbname,
+ &dbinfo->made_publication);
+ else
+ pg_log_info("preserving existing publication \"%s\" in database \"%s\"",
+ dbinfo->pubname, dbinfo->dbname);
+ }
4a.
I always find it difficult to follow the logic of this function.
AFAICT the "dry_mode" logic is already handled within the
drop_publication(), so check_and_drop_publications() can be simplified
by refactoring it:
CURRENT
if (drop_all_pubs)
{
...
}
if (!drop_all_pub || dry_mode)
{
...
}
SUGGEST
if (drop_all_pubs)
{
...
}
else
{
...
}
~
4b.
Why is "preserving existing publication" logged in a --dry-run only?
Shouldn't you see the same logs regardless of --dry-run? That's kind
of the whole point of a dry run, right?
======
.../t/040_pg_createsubscriber.pl
5.
+my $node_s3 = PostgreSQL::Test::Cluster->new('node_s3');
+$node_s3->init_from_backup($node_p, 'backup_tablepub', has_streaming => 1);
+
Should this be done later, where it is needed, combined with the
node_s3->start/stop?
~~~
6.
+# Run pg_createsubscriber on node S3 without --dry-run and --clean option
+# to verify that the existing publications are preserved.
+command_ok(
+ [
+ 'pg_createsubscriber',
+ '--verbose', '--verbose',
+ '--recovery-timeout' => $PostgreSQL::Test::Utils::timeout_default,
+ '--pgdata' => $node_s3->data_dir,
+ '--publisher-server' => $node_p->connstr($db1),
+ '--socketdir' => $node_s3->host,
+ '--subscriber-port' => $node_s3->port,
+ '--database' => $db1,
+ '--database' => $db2,
+ '--publication' => 'test_pub_existing',
+ '--publication' => 'test_pub_new',
+ '--clean' => 'publications',
+ ],
+ 'run pg_createsubscriber on node S3');
6a.
That comment wording "without --dry-run and --clean option" is
confusing because it sounds like "without --dry-run" and "without
--clean"
~
6b.
There are other untested combinations like --dry-run with --clean,
which would give more confidence about the logic of function
check_and_drop_publications().
Perhaps this test could have been written using --dry-run, and you
could be checking the logs for the expected "drop" or "preserving"
messages.
~~~
7.
+# Confirm publication created by pg_createsubscriber is removed
+is( $node_s3->safe_psql(
+ $db1,
+ "SELECT COUNT(*) FROM pg_publication WHERE pubname = 'test_pub_new';"
+ ),
+ '0',
+ 'publication created by pg_createsubscriber have been removed');
+
Hmm. Here you are testing the new publication was deleted, so nowhere
is testing what the earlier comment said it was doing "...verify that
the existing publications are preserved.".
======
Kind Regards,
Peter Smith
Fujitsu Australia
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2025-09-18 00:00:50 | Re: Parallel heap vacuum |
Previous Message | Michael Paquier | 2025-09-17 23:32:22 | Re: PgStat_HashKey padding issue when passed by reference |