From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
Cc: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org> |
Subject: | Re: Logical Replication of sequences |
Date: | 2025-10-20 22:18:50 |
Message-ID: | CAD21AoDNT4uTDVwg76gZXn9xOX4mNCg-ZVFJHbPdDVvqS8V6PQ@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Oct 20, 2025 at 4:59 AM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> > Here is the latest patch set which addressed Shveta[1], Amit[2], Chao[3][4],
> > Dilip[5], Sawada-San's[6] comments.
>
> I found the patch could not pass the sanity check, because 0001 missed a comma.
> Also, there was a duplicated entry for `REFRESH SEQUENCES`.
>
> Attached set fixed the issue.
>
Thank you for updating the patch! I have a few comments on the 0001 patch:
- appendStringInfo(&cmd, "SELECT DISTINCT n.nspname, c.relname,
gpt.attrs\n"
- " FROM pg_class c\n"
+ appendStringInfo(&cmd, "SELECT DISTINCT n.nspname, c.relname,
gpt.attrs");
+
+ if (support_relkind_seq)
+ appendStringInfo(&cmd, ", c.relkind\n");
+
+ appendStringInfo(&cmd, " FROM pg_class c\n"
" JOIN pg_namespace n ON n.oid =
c.relnamespace\n"
" JOIN ( SELECT
(pg_get_publication_tables(VARIADIC array_agg(pubname::text))).*\n"
" FROM pg_publication\n"
" WHERE pubname IN ( %s )) AS gpt\n"
" ON gpt.relid = c.oid\n",
pub_names->data);
I think we don't necessarily need to avoid getting relkind for servers
older than 19. IIUC the reason why we had to have check_columnlist was
that the attnames column was introduced to pg_publication_tables
catalog. But I think this patch is not the case since we get relkind
from pg_class. That is, I think we can get 4 columns from server >=16.
---
/*
- * Check and log a warning if the publisher has subscribed to the same table,
- * its partition ancestors (if it's a partition), or its partition children (if
- * it's a partitioned table), from some other publishers. This check is
- * required in the following scenarios:
+ * Check and log a warning if the publisher has subscribed to the same relation
+ * (table or sequence), its partition ancestors (if it's a partition), or its
+ * partition children (if it's a partitioned table), from some other
publishers.
+ * This check is required in the following scenarios:
*
* 1) For CREATE SUBSCRIPTION and ALTER SUBSCRIPTION ... REFRESH PUBLICATION
* statements with "copy_data = true" and "origin = none":
* - Warn the user that data with an origin might have been copied.
- * - This check is skipped for tables already added, as incremental sync via
- * WAL allows origin tracking. The list of such tables is in
- * subrel_local_oids.
+ * - This check is skipped for tables and sequences already added, as
+ * incremental sync via WAL allows origin tracking. The list of
such tables
+ * is in subrel_local_oids.
*
* 2) For CREATE SUBSCRIPTION and ALTER SUBSCRIPTION ... REFRESH PUBLICATION
* statements with "retain_dead_tuples = true" and "origin = any", and for
@@ -2338,13 +2440,19 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
* - Warn the user that only conflict detection info for local changes on
* the publisher is retained. Data from other origins may lack sufficient
* details for reliable conflict detection.
+ * - This check targets for tables only.
* - See comments atop worker.c for more details.
+ *
+ * 3) For ALTER SUBSCRIPTION ... REFRESH SEQUENCE statements with "origin =
+ * none":
+ * - Warn the user that sequence data from another origin might have been
+ * copied.
*/
While this function is well documented, I find it's quite complex, and
this patch adds to that complexity. The function has 9 arguments,
making it difficult to understand which combinations of arguments
enable which checks. For example, the function header comment doesn't
explain when to use the only_sequences parameter. At first, I thought
only_sequences should be set to true when checking if the publisher
has subscribed to sequences from other publishers, but looking at the
code, I discovered it doesn't check sequences when check_rdt is true:
+ if (walrcv_server_version(wrconn) < 190000 || check_rdt)
+ appendStringInfo(&cmd, query,
+ "(SELECT relid, TRUE as istable FROM
pg_get_publication_tables(P.pubname))");
+ else if (only_sequences)
+ appendStringInfo(&cmd, query,
+ "(SELECT relid, FALSE as istable FROM
pg_get_publication_sequences(P.pubname))");
+ else
+ appendStringInfo(&cmd, query,
+ "(SELECT relid, TRUE as istable FROM
pg_get_publication_tables(P.pubname) UNION ALL"
+ " SELECT relid, FALSE as istable FROM
pg_get_publication_sequences(P.pubname))");
+
I find that the complexity might stem from checking different cases in
one function, but I don't have better ideas to improve the logic for
now. I think we can at least describe what the caller can expect from
specifying only_sequence to true.
---
+ * apply workers initilization, and to handle origin creation dynamically
s/initilization/initialization/
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2025-10-20 22:23:27 | Re: [PATCH v4] parallel pg_restore: avoid disk seeks when jumping short distance forward |
Previous Message | Tom Lane | 2025-10-20 22:15:13 | Re: [PATCH v4] parallel pg_restore: avoid disk seeks when jumping short distance forward |