RE: Logical Replication of sequences

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'vignesh C' <vignesh21(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: RE: Logical Replication of sequences
Date: 2025-07-28 07:56:18
Message-ID: OSCPR01MB14966C5FBEBB10D70445BC737F55AA@OSCPR01MB14966.jpnprd01.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Vignesh,

Here are remained comments for v20250723 0003-0005. I've not checked the latest version.

01.
```
* PostgreSQL logical replication: common synchronization code
```

How about: "Common code for synchronizations"? Since this file locates in
replication/logical, initial part is bit trivial.

02.

How do you feel to separate header file into syncutils.h file? We can put some
definitions needed for synchronizations.

03.
```
-extern void getSubscriptionTables(Archive *fout);
+extern void getSubscriptionRelations(Archive *fout);
```

Assuming that this function obtains both tables and sequences. I'm now wondering
we can say "relation" for both the tables and sequences in the context. E.g.,
getTableData()->makeTableDataInfo() seems to obtain both table and sequence data,
and dumpSequenceData() dumps sequences. How about keep getSubscriptionTables, or
getSubscriptionTablesAndSequences?

04.
```
/*
* dumpSubscriptionTable
* Dump the definition of the given subscription table mapping. This will be
* used only in binary-upgrade mode for PG17 or later versions.
*/
static void
dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
```

If you rename getSubscriptionTables, dumpSubscriptionTable should be also renamed.

05.
```
/*
* Gets list of all relations published by FOR ALL SEQUENCES publication(s).
*/
List *
GetAllSequencesPublicationRelations(void)
```

It looks very similar with GetAllTablesPublicationRelations(). Can we combine them?
I feel we can pass the kind of target relation and pubviaroot.

06.
```
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -27,6 +27,7 @@
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
+#include "utils/memutils.h"
```

I can build without the header.

07.
```
+ /*
+ * XXX: If the subscription is for a sequence-only publication, creating a
+ * replication origin is unnecessary because incremental synchronization
+ * of sequences is not supported, and sequence data is fully synced during
+ * a REFRESH, which does not rely on the origin. If the publication is
+ * later modified to include tables, the origin can be created during the
+ * ALTER SUBSCRIPTION ... REFRESH command.
+ */
ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname, sizeof(originname));
replorigin_create(originname);
```

The comment is bit misleading because currently we create the replicaton origin
in the case. Can you clarify the point like:
```
XXX: Now the replication origin is created for all the cases, but it is unnecessary
when the subcription is for a sequence-only publicaiton....
```

08.
```
+ * XXX: If the subscription is for a sequence-only publication,
+ * creating this slot is unnecessary. It can be created later
+ * during the ALTER SUBSCRIPTION ... REFRESH PUBLICATION or ALTER
+ * SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES command, if the
+ * publication is updated to include tables.
```

Same as above.

09.
```
+ * Assert copy_data is true.
+ * Assert refresh_tables is false.
+ * Assert refresh_sequences is true
```

IIUC assertions are rarely described the comment stop function. If you want to
add, we should say like:
```
In Assert enabled builds, we verify that parameters are passed correctly...
```

10.
```
+#ifdef USE_ASSERT_CHECKING
+ /* Sanity checks for parameter values */
+ if (resync_all_sequences)
+ Assert(copy_data && !refresh_tables && refresh_sequences);
+#endif
```

How about below, which does not require ifdef.

```
Assert(!resync_all_sequences ||
(copy_data && !refresh_tables && refresh_sequences));

```

11.
```
+ bool issequence;
+ bool istable;
```

Isn't it enough to use istable?

12.
```
+ /* Relation is either a sequence or a table */
+ issequence = get_rel_relkind(subrel->srrelid) == RELKIND_SEQUENCE;
```

How about adding an Assert() to ensure the relation is either of table or sequence?

13.
```
+ * not_ready:
+ * If getting tables and not_ready is false get all tables, otherwise,
+ * only get tables that have not reached READY state.
+ * If getting sequences and not_ready is false get all sequences,
+ * otherwise, only get sequences that have not reached READY state (i.e. are
+ * still in INIT state).
```
Two parts decribe mostly same point. How about:
If true, this function returns only the relations that are not in a ready state.
Otherwise returns all the relations of the subscription.

14.
```
+ char table_state;
```

It should be `relation_state`.

15.
```
+#define SEQ_LOG_CNT_INVALID 0
```

Can you add comment how we use it?

16.
```
+
+ TimestampTz last_seqsync_start_time;
```

I can't find the user of this attribute, is it needed?

17.
```
+ FetchRelationStates(&has_pending_sequences);
+ ProcessSyncingTablesForApply(current_lsn);
+ if (has_pending_sequences)
+ ProcessSyncingSequencesForApply();
```

IIUC we do not always call ProcessSyncingSequencesForApply() because it would acquire
the LW lock. Can you clariy it as comments?

18.
```
+ case WORKERTYPE_SEQUENCESYNC:
+ /* Should never happen. */
+ Assert(0);
```

Should we call elog(ERROR) instead of Assert(0) like another case?

19.
```
/* Find the leader apply worker and signal it. */
logicalrep_worker_wakeup(MyLogicalRepWorker->subid, InvalidOid);
```

Do we have to signal to the leader even when the sequence worker exits?

Best regards,
Hayato Kuroda
FUJITSU LIMITED

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jelte Fennema-Nio 2025-07-28 08:01:38 Re: Extension security improvement: Add support for extensions with an owned schema
Previous Message Mircea Cadariu 2025-07-28 07:52:16 Re: Add os_page_num to pg_buffercache