Re: Improve pg_sync_replication_slots() to wait for primary to advance

From: Ajin Cherian <itsajin(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improve pg_sync_replication_slots() to wait for primary to advance
Date: 2025-09-15 12:47:43
Message-ID: CAFPTHDbZz1tLfoBqRoJuFffknAtF4asfBQax+TLOnMV-6apcww@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 10, 2025 at 2:45 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Tue, Sep 9, 2025 at 5:37 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> >
> > Attached v11 patch addressing the above comments.
> >
>
> Please find a few comments:
>
> 1)
>
> + Retry is done after 2
> + * sec wait. Exits early if promotion is triggered or certain critical
>
> We can say: Retry is done after SLOTSYNC_API_NAPTIME_MS wait.
>

Changed.

> 2)
> + /*
> + * Fetch remote slot info for the given slot_names. If slot_names is NIL,
> + * fetch all failover-enabled slots. Note that we reuse slot_names from
> + * the previous iteration; re-fetching all failover slots each time could
> + * cause an endless loop.
> + */
>
> a)
> the previous iteration --> the first iteration.
>
> b) Also we can mention the reason why we take names from first
> iteration instead of going for pending ones alone, something like:
>
> Instead of reprocessing only the pending slots in each iteration, it's
> better to process all the slots received in the first iteration.
> This ensures that by the time we're done, all slots reflect the latest values.
>
> 3)
> + remote_slots = fetch_remote_slots(wrconn, slot_names);
> +
> +
> + /* Attempt to synchronize slots */
> + synchronize_slots(wrconn, remote_slots, &slot_persistence_pending);
>
> One extra blank line can be removed
>

Fixed.

> 4)
>
> + /* Clean up slot_names if allocated in TopMemoryContext */
> + if (slot_names)
> + list_free_deep(slot_names);
>
> Can we please move it before 'ReplicationSlotCleanup'.
>

Fixed.

> 5)
> In case of error in subsequent iteration, slot_names allocated from
> TopMemoryContext will be left unfreed?
>

I've changed the logic so that even on error, slot_names are freed.

> 6)
> + ListCell *lc;
> + bool first_slot = true;
>
> Shall we move these two to concerned if-block:
> if (slot_names != NIL)
>

Changed.

> 7)
> * The slot_persistence_pending flag is used by the pg_sync_replication_slots
> * API to track if any slots could not be persisted and need to be retried.
>
> a) Instead of mentioning only about slot_persistence_pending argument
> in concerned function's header, we shall define all the arguments.
>
> b) We can remove the 'flag' term from the comments as it is a
> function-argument now.
>

Changed.

> 8)
> I think we should add briefly in the header of the file about the new
> behaviour of API.
>

Added.

Attaching patch v12 addressing these comments.

regards,
Ajin Cherian
Fujitsu Australia

Attachment Content-Type Size
v12-0001-Improve-initial-slot-synchronization-in-pg_sync_.patch application/octet-stream 23.6 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kirill Reshke 2025-09-15 12:56:08 Remove custom redundant full page write description from GIN
Previous Message jian he 2025-09-15 12:40:36 Re: let ALTER TABLE DROP COLUMN drop whole-row referenced object