From: | Shubham Khanna <khannashubham1197(at)gmail(dot)com> |
---|---|
To: | Peter Smith <smithpb2250(at)gmail(dot)com> |
Cc: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Add support for specifying tables in pg_createsubscriber. |
Date: | 2025-09-24 06:37:38 |
Message-ID: | CAHv8Rj+isv4kXQOOGsGUL3ncAXSXOH7_GxR4v5DwfkUgBXt5vg@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Sep 23, 2025 at 1:22 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi Shubham,
>
> Here are some v10 review comments.
>
> ======
> 1. GENERAL
>
> A couple of my review comments below (#4, #7) are about the tense of
> the info messages; My comments are trying to introduce some
> consistency in the patch but unfortunately I see there are already
> lots of existing messages where the tense is a muddle of past/present
> (e.g. "creating"/"could not create" instead of "created"/"could not
> create" or "creating"/"cannot create". Perhaps it's better to just
> ignore my pg_log_info comments and do whatever seems right for this
> patch, but also make another thread to fix the tense for all the
> messages in one go.
>
As of now, I’ve addressed your comments related to the pg_log_info
messages. If you still feel there are more cases that should be
corrected, please let me know and I can start a separate thread to
handle tense consistency across all such messages.
> ======
> Commit Message
>
> 2.
> When publications are reused, they are never dropped during cleanup operations,
> ensuring pre-existing publications remain available for other uses.
> Only publications created by pg_createsubscriber are cleaned up.
>
> ~
>
> AFAICT, the implemented behaviour has flip-flopped again regarding
> --clean; I think now --clean=publications is unchanged from master
> (e.g. --clean=publication will drop all publications).
>
> That's OK, but you need to ensure the commit message is saying the
> same thing. e.g. currently it says "cleanup operations" never drop
> reused publications, but what about "--clean=publications" -- that's a
> cleanup operation, isn't it?
>
Fixed.
> ======
> src/bin/pg_basebackup/pg_createsubscriber.c
>
> check_publication_exists:
>
> 3.
> +/*
> + * Check if a publication with the given name exists in the specified database.
> + * Returns true if it exists, false otherwise.
> + */
>
> It seems a verbose way of saying:
> Return whether a specified publication exists in the specified database.
>
Fixed.
> ~~~
>
> check_and_drop_publications:
>
> 4.
> + if (check_publication_exists(conn, dbinfo[i].pubname, dbinfo[i].dbname))
> + {
> + /* Reuse existing publication on publisher. */
> + pg_log_info("using existing publication \"%s\" in database \"%s\"",
> + dbinfo[i].pubname, dbinfo[i].dbname);
> + dbinfo[i].made_publication = false;
> + }
> + else
> + {
> + /*
> + * Create publication on publisher. This step should be executed
> + * *before* promoting the subscriber to avoid any transactions
> + * between consistent LSN and the new publication rows (such
> + * transactions wouldn't see the new publication rows resulting in
> + * an error).
> + */
> + create_publication(conn, &dbinfo[i]);
> + pg_log_info("created publication \"%s\" in database \"%s\"",
> + dbinfo[i].pubname, dbinfo[i].dbname);
> + dbinfo[i].made_publication = true;
> + }
>
> The tense of these messages seems inconsistent, and is also different
> from another nearby message: "create replication slot..."
>
> So, should these pg_log_info change to match? For example:
> "using existing publication" --> "use existing..."
> "created publication" --> "create publication..."
>
Fixed.
> ~~~
>
> check_and_drop_publications:
>
> 5.
> * Since the publications were created before the consistent LSN, they
> * remain on the subscriber even after the physical replica is
> * promoted. Remove these publications from the subscriber because
> - * they have no use. Additionally, if requested, drop all pre-existing
> - * publications.
> + * they have no use. If --clean=publications is specified, drop all existing
> + * publications in the database. Otherwise, only drop publications that were
> + * created by pg_createsubscriber during this operation, preserving any
> + * pre-existing publications.
> */
>
>
> This function comment seems overkill now that this function has
> evolved. e.g. That whole first part (before "If --clean...") seems
> like it would be better just as a comment within the function body
> 'else' block rather than needing to say anything in the function
> comment.
>
Fixed.
> ~~~
>
> 6.
> for (int i = 0; i < PQntuples(res); i++)
> drop_publication(conn, PQgetvalue(res, i, 0), dbinfo->dbname,
> &dbinfo->made_publication);
> -
> PQclear(res);
> This whitespace change is not needed for the patch.
>
Fixed.
> ~~~
>
> 7.
> + else
> + {
> + /*
> + * Only drop publications that were created by pg_createsubscriber
> + * during this operation. Pre-existing publications are preserved.
> + */
> + if (dbinfo->made_publication)
> + drop_publication(conn, dbinfo->pubname, dbinfo->dbname,
> + &dbinfo->made_publication);
> + else
> + pg_log_info("preserving existing publication \"%s\" in database \"%s\"",
> + dbinfo->pubname, dbinfo->dbname);
> + }
>
> For consistency with earlier review comment, maybe the message should
> be reworded:
> "preserving existing publication" --> "preserve existing publication"
Fixed.
> ======
> .../t/040_pg_createsubscriber.pl
>
> 8.
> +# Run pg_createsubscriber on node S2 without --dry-run
> +command_ok(
>
> and later...
>
> +# Run pg_createsubscriber on node S2 without --dry-run
> +command_ok(
>
> ~
>
> These are not very informative comments. It should say more about the
> purpose -- e.g. you are specifying --publication to reuse one
> publication and create another ... to demonstrate that yada yada...
>
Fixed.
> ~~~
>
> 9.
> +# Verify the existing publication is still there and unchanged
> +my $existing_pub_count = $node_p->safe_psql($db1,
> + "SELECT COUNT(*) FROM pg_publication WHERE pubname = 'test_pub_existing'"
> +);
> +is($existing_pub_count, '1',
> + 'existing publication remains unchanged after dry-run');
> +
> +# Verify no actual publications were created during dry-run
> +my $pub_count_after_dry_run = $node_p->safe_psql($db2,
> + "SELECT COUNT(*) FROM pg_publication WHERE pubname = 'test_pub_new'");
> +is($pub_count_after_dry_run, '0',
> + 'dry-run did not actually create publications');
> +
>
> Instead of 2 SQLs to check whether something exists and something else
> does not exist, can't you just have 1 SELECT to fetch all publication
> names, then you can deduce the same thing from the result.
>
In the attached patch, I have removed that test case. It’s not really
valid to combine multiple switches in this way, since using
--publication together with --clean=publications is effectively the
same as running pg_createsubscriber twice (once with each switch).
That makes the extra combination test redundant, so I have dropped it.
> ~~~
>
> 10.
> +# Run pg_createsubscriber on node S3 with --dry-run
> +($stdout, $stderr) = run_command(
>
> Again, it's not a very informative comment. It should say more about
> the purpose -- e.g. you are specifying --publication and
> --clean=publications at the same time ... to demonstrate yada yada...
>
Fixed.
> ~~~
>
> 11.
> +# Verify nothing was actually changed
> +my $existing_pub_still_exists = $node_p->safe_psql($db1,
> + "SELECT COUNT(*) FROM pg_publication WHERE pubname = 'test_pub_existing'"
> +);
> +is($existing_pub_still_exists, '1',
> + 'existing publication still exists after dry-run with --clean');
> +
> +my $new_pub_still_exists = $node_p->safe_psql($db2,
> + "SELECT COUNT(*) FROM pg_publication WHERE pubname = 'test_pub_new'");
> +is($new_pub_still_exists, '1',
> + 'pg_createsubscriber publication still exists after dry-run with --clean'
> +);
> +
>
> Isn't this another example of something that is easily verified with a
> single SQL instead of checking both publications separately?
>
Fixed.
> ~~~
>
> 12.
> # Run pg_createsubscriber on node S3 with --clean option to verify that the
> # existing publications are preserved.
> command_ok(
>
> Is that comment correct? I don't think so because --clean=publications
> is supposed to drop all publications, right?
>
Fixed.
> ~~~
>
> 13.
> +# Confirm ALL publications were removed (both existing and new)
> +is( $node_s3->safe_psql(
> + $db1,
> + "SELECT COUNT(*) FROM pg_publication WHERE pubname = 'test_pub_existing';"
> + ),
> + '0',
> + 'pre-existing publication was removed by --clean=publications');
> +
> +is( $node_s3->safe_psql(
> + $db2,
> + "SELECT COUNT(*) FROM pg_publication WHERE pubname = 'test_pub_new';"
> + ),
> + '0',
> + 'publication created by pg_createsubscriber was removed by
> --clean=publications'
> +);
> +
>
> Are 2x SELECT needed here? Can't you just have a single select to
> discover that there are zero publications?
>
Fixed.
The attached patch contains the suggested changes.
Thanks and regards,
Shubham Khanna.
Attachment | Content-Type | Size |
---|---|---|
v11-0001-Support-existing-publications-in-pg_createsubscr.patch | application/octet-stream | 12.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Chao Li | 2025-09-24 06:37:43 | Re: SQL:2023 JSON simplified accessor support |
Previous Message | John Naylor | 2025-09-24 06:32:47 | Re: Proposal for enabling auto-vectorization for checksum calculations |