Re: Improve pg_sync_replication_slots() to wait for primary to advance

From: Japin Li <japinli(at)hotmail(dot)com>
To: Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improve pg_sync_replication_slots() to wait for primary to advance
Date: 2025-10-31 04:42:27
Message-ID: ME0P300MB0445D2A2B2096E2DDED19247B6F8A@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 30 Oct 2025 at 21:18, Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> On Mon, Oct 27, 2025 at 8:22 PM Japin Li <japinli(at)hotmail(dot)com> wrote:
>>
>>
>> Hi, Ajin
>>
>> Thanks for updating the patch.
>>
>> On Mon, 27 Oct 2025 at 18:47, Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
>> > On Fri, Oct 24, 2025 at 8:29 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>> >>
>> >> On Wed, Oct 22, 2025 at 10:25 AM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
>> >> >
>> >> >
>> >> > I've modified the comments to reflect the new changes.
>> >> >
>> >> > attaching patch v18 with the above changes.
>> >> >
>> >>
>> >> Thanks for the patch. The test is still not clear. Can we please add
>> >> the test after the test of "Test logical failover slots corresponding
>> >> to different plugins" finishes instead of adding it in between?
>> >>
>> >
>> > I've rewritten the tests again to make this possible. Attaching v19
>> > which has the modified tap test.
>>
>> Here are some comments on the new patch.
>>
>> 1. Given the existence of the foreach_ptr macro, we can switch the usage of
>> foreach to foreach_ptr.
>>
>> diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
>> index 1b78ffc5ff1..5db51407a82 100644
>> --- a/src/backend/replication/logical/slotsync.c
>> +++ b/src/backend/replication/logical/slotsync.c
>> @@ -872,7 +872,6 @@ fetch_remote_slots(WalReceiverConn *wrconn, List *slot_names)
>>
>> if (slot_names != NIL)
>> {
>> - ListCell *lc;
>> bool first_slot = true;
>>
>> /*
>> @@ -880,10 +879,8 @@ fetch_remote_slots(WalReceiverConn *wrconn, List *slot_names)
>> */
>> appendStringInfoString(&query, " AND slot_name IN (");
>>
>> - foreach(lc, slot_names)
>> + foreach_ptr(char, slot_name, slot_names)
>> {
>> - char *slot_name = (char *) lfirst(lc);
>> -
>> if (!first_slot)
>> appendStringInfoString(&query, ", ");
>>
>> @@ -1872,15 +1869,13 @@ static List *
>> extract_slot_names(List *remote_slots)
>> {
>> List *slot_names = NIL;
>> - ListCell *lc;
>> MemoryContext oldcontext;
>>
>> /* Switch to long-lived TopMemoryContext to store slot names */
>> oldcontext = MemoryContextSwitchTo(TopMemoryContext);
>>
>> - foreach(lc, remote_slots)
>> + foreach_ptr(RemoteSlot, remote_slot, remote_slots)
>> {
>> - RemoteSlot *remote_slot = (RemoteSlot *) lfirst(lc);
>> char *slot_name;
>>
>> slot_name = pstrdup(remote_slot->name);
>>
>> 2. To append a signal character, switch from appendStringInfoString() to the
>> more efficient appendStringInfoChar().
>>
>> + appendStringInfoString(&query, ")");
>>
>> 3. The query memory can be released immediately after walrcv_exec() because
>> there are no subsequent references.
>>
>> @@ -895,6 +892,7 @@ fetch_remote_slots(WalReceiverConn *wrconn, List *slot_names)
>>
>> /* Execute the query */
>> res = walrcv_exec(wrconn, query.data, SLOTSYNC_COLUMN_COUNT, slotRow);
>> + pfree(query.data);
>> if (res->status != WALRCV_OK_TUPLES)
>> ereport(ERROR,
>> errmsg("could not fetch failover logical slots info from the primary server: %s",
>> @@ -975,7 +973,6 @@ fetch_remote_slots(WalReceiverConn *wrconn, List *slot_names)
>> }
>>
>> walrcv_clear_result(res);
>> - pfree(query.data);
>>
>> return remote_slot_list;
>> }
>>
>
> Thanks for your review, Japin. Here's patch v20 addressing the comments.
>

Thanks for updating the patch. Here are some comments on v20.

1. Since the content is unchanged, no modification is needed here.

- * We do not drop the slot because the restart_lsn can be ahead of the
- * current location when recreating the slot in the next cycle. It may
- * take more time to create such a slot. Therefore, we keep this slot
- * and attempt the synchronization in the next cycle.
+ * We do not drop the slot because the restart_lsn can be
+ * ahead of the current location when recreating the slot in
+ * the next cycle. It may take more time to create such a
+ * slot. Therefore, we keep this slot and attempt the
+ * synchronization in the next cycle.

2. Could we align the parameter comment style for synchronize_slots() and
fetch_remote_slots() for better consistency?

3. Is this redundant? It was already initialized to false during declaration.

+ /* Reset flag before every iteration */
+ slot_persistence_pending = false;

4. A minor nitpick. The opening brace should be on a new line for style
consistency.

+ if (!IsTransactionState()) {
+ StartTransactionCommand();
+ started_tx = true;
+ }

5. Given that fparams.slot_names is a list, I suggest we replace NULL with NIL
for type consistency.

+ fparams.slot_names = NULL;

--
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2025-10-31 04:59:04 Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Previous Message Amit Kapila 2025-10-31 04:40:52 Re: Logical Replication of sequences