From: | Ajin Cherian <itsajin(at)gmail(dot)com> |
---|---|
To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
Cc: | Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Improve pg_sync_replication_slots() to wait for primary to advance |
Date: | 2025-09-15 12:47:43 |
Message-ID: | CAFPTHDbZz1tLfoBqRoJuFffknAtF4asfBQax+TLOnMV-6apcww@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Sep 10, 2025 at 2:45 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Tue, Sep 9, 2025 at 5:37 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> >
> > Attached v11 patch addressing the above comments.
> >
>
> Please find a few comments:
>
> 1)
>
> + Retry is done after 2
> + * sec wait. Exits early if promotion is triggered or certain critical
>
> We can say: Retry is done after SLOTSYNC_API_NAPTIME_MS wait.
>
Changed.
> 2)
> + /*
> + * Fetch remote slot info for the given slot_names. If slot_names is NIL,
> + * fetch all failover-enabled slots. Note that we reuse slot_names from
> + * the previous iteration; re-fetching all failover slots each time could
> + * cause an endless loop.
> + */
>
> a)
> the previous iteration --> the first iteration.
>
> b) Also we can mention the reason why we take names from first
> iteration instead of going for pending ones alone, something like:
>
> Instead of reprocessing only the pending slots in each iteration, it's
> better to process all the slots received in the first iteration.
> This ensures that by the time we're done, all slots reflect the latest values.
>
> 3)
> + remote_slots = fetch_remote_slots(wrconn, slot_names);
> +
> +
> + /* Attempt to synchronize slots */
> + synchronize_slots(wrconn, remote_slots, &slot_persistence_pending);
>
> One extra blank line can be removed
>
Fixed.
> 4)
>
> + /* Clean up slot_names if allocated in TopMemoryContext */
> + if (slot_names)
> + list_free_deep(slot_names);
>
> Can we please move it before 'ReplicationSlotCleanup'.
>
Fixed.
> 5)
> In case of error in subsequent iteration, slot_names allocated from
> TopMemoryContext will be left unfreed?
>
I've changed the logic so that even on error, slot_names are freed.
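For readers following the thread without the patch at hand, points 2 and 5 can be illustrated with a small standalone sketch. This is not the code in v12: NameList, sync_once(), sync_with_retry() and lists_freed are hypothetical names, and plain setjmp/longjmp stands in for PostgreSQL's error handling (backend code would more likely rely on PG_TRY()/PG_FINALLY() or PG_ENSURE_ERROR_CLEANUP()). It only shows the shape of the idea: the name list is built once in the first iteration, reused in every retry, and freed on both the normal and the error path.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the slot_names list kept across retries. */
typedef struct NameList
{
    char  **names;
    size_t  count;
} NameList;

int lists_freed = 0;            /* counts cleanups, for demonstration only */
static jmp_buf error_env;       /* stand-in for the backend's error stack */

static NameList *
make_name_list(const char *const *src, size_t count)
{
    NameList *list = malloc(sizeof(NameList));

    if (list == NULL)
        abort();
    list->names = malloc(count * sizeof(char *));
    if (list->names == NULL)
        abort();
    for (size_t i = 0; i < count; i++)
    {
        list->names[i] = malloc(strlen(src[i]) + 1);
        if (list->names[i] == NULL)
            abort();
        strcpy(list->names[i], src[i]);
    }
    list->count = count;
    return list;
}

static void
free_name_list(NameList *list)
{
    if (list == NULL)
        return;
    for (size_t i = 0; i < list->count; i++)
        free(list->names[i]);
    free(list->names);
    free(list);
    lists_freed++;
}

/*
 * Hypothetical sync step. It "errors out" via longjmp on attempt number
 * fail_on_attempt, and otherwise reports whether all slots are persisted.
 */
static bool
sync_once(const NameList *list, int attempt, int fail_on_attempt,
          int succeed_on_attempt)
{
    assert(list != NULL && list->count > 0);
    if (attempt == fail_on_attempt)
        longjmp(error_env, 1);
    return attempt >= succeed_on_attempt;
}

/* Returns 0 on success and -1 on error; the list is freed either way. */
int
sync_with_retry(int fail_on_attempt, int succeed_on_attempt)
{
    static const char *const initial[] = {"slot_a", "slot_b"};
    NameList *volatile names = NULL;    /* volatile: read after longjmp */
    volatile int attempt = 0;
    int         result = 0;

    if (setjmp(error_env) == 0)
    {
        for (;;)
        {
            attempt++;
            /* Capture the names once; later iterations reuse them. */
            if (names == NULL)
                names = make_name_list(initial, 2);
            if (sync_once(names, attempt, fail_on_attempt,
                          succeed_on_attempt))
                break;
            /* Real code would nap for SLOTSYNC_API_NAPTIME_MS here. */
        }
    }
    else
        result = -1;            /* reached only through longjmp */

    free_name_list(names);      /* common cleanup for both paths */
    return result;
}
```

Calling sync_with_retry(0, 3) retries until the third attempt and then frees the list; sync_with_retry(2, 5) hits the simulated error on the second attempt and still frees it.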
> 6)
> + ListCell *lc;
> + bool first_slot = true;
>
> Shall we move these two to concerned if-block:
> if (slot_names != NIL)
>
Changed.
> 7)
> * The slot_persistence_pending flag is used by the pg_sync_replication_slots
> * API to track if any slots could not be persisted and need to be retried.
>
> a) Instead of mentioning only about slot_persistence_pending argument
> in concerned function's header, we shall define all the arguments.
>
> b) We can remove the 'flag' term from the comments as it is a
> function-argument now.
>
Changed.
> 8)
> I think we should add briefly in the header of the file about the new
> behaviour of API.
>
Added.
Attaching patch v12 addressing these comments.
regards,
Ajin Cherian
Fujitsu Australia
Attachment | Content-Type | Size |
---|---|---|
v12-0001-Improve-initial-slot-synchronization-in-pg_sync_.patch | application/octet-stream | 23.6 KB |