| From: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> |
|---|---|
| To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Cc: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
| Subject: | Re: How can end users know the cause of LR slot sync delays? |
| Date: | 2025-11-21 10:11:44 |
| Message-ID: | CANhcyEUiY4ENuoi7kZSsLJFLn6yA_-oPCKrek=BaMfFfY3=P1w@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Fri, 21 Nov 2025 at 11:00, shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Fri, Nov 21, 2025 at 9:58 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Fri, Nov 21, 2025 at 8:52 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Nov 18, 2025 at 4:07 PM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
> > > >
> > > > On Fri, 14 Nov 2025 at 14:13, Hayato Kuroda (Fujitsu)
> > > > <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> > > > >
> > > > > Dear Shlok,
> > > > >
> > > > > Thanks for updating the patch. Few more comments.
> > > > >
> > > > > > > > > I’m not sure if this has already been discussed; I couldn’t find any
> > > > > > > > > mention of it in the thread. Why don’t we persist
> > > > > > > > > 'slot_sync_skip_reason' (it is outside of
> > > > > > > > > ReplicationSlotPersistentData)? If a slot wasn’t synced during the
> > > > > > > > > last cycle and the server restarts, it would be helpful to know the
> > > > > > > > > reason it wasn’t synced prior to the node restart.
> > > > > > > > >
> > > > > > Actually I did not think in this direction. I think it will be useful
> > > > > > to persist 'slot_sync_skip_reason'. I have made the change for the
> > > > > > same in the latest patch.
> > > > >
> > > > > Hmm, I'm wondering it should be written on the disk. Other attributes on the disk
> > > > > are essential to decode or replicate changes correctly, but sync status is not
> > > > > used for the purpose. Personally considered, slot sync would re-start soon after
> > > > > the reboot so that it is OK to start with empty. How about others?
> > > > >
> > > > > If we want to serialize the info, we should do further tasks:
> > > > > - update SLOT_VERSION
> > > > > - make the slot dirty then SaveSlotToPath() when the status is updated.
> > > > >
> > > > I agree with your point. Slot synchronization will restart shortly
> > > > after a reboot, so it seems reasonable to begin with an empty state
> > > > rather than persisting slot_sync_skip_reason.
> > > > For now, I’ve updated the patch so that slot_sync_skip_reason is no
> > > > longer persisted; its initialization is kept outside of
> > > > ReplicationSlotPersistentData. I’d also like to hear what others
> > > > think.
> > > >
> > >
> > > Users may even use an API to synchronize the slots rather than
> > > slotsync worker. In that case synchronization won't start immediately
> > > after server-restart.
> > >
> >
> > But I think after restart in most cases, the slot will be created
> > fresh as we persist the slot for the first time only when sync is
> > successful. Now, when the standby has not flushed the WAL
> > corresponding to remote_lsn (SS_SKIP_WAL_NOT_FLUSHED), the slotsync
> > can be skipped even for persisted slots but that should be rare and we
> > anyway won't be able to persist the slot_skip reason in other cases as
> > slot itself won't be persisted by that time. So, I feel keeping the
> > slot_sync_skip_reason in memory is sufficient.
> >
>
> Okay, makes sense.
>
> A few comments on 001:
>
> 1)
> + slots, but may (if leftover from a promotedstandby) contain a
> timestamp.
> promotedstandby --> promoted standby
>
> 2)
> + s.slotsync_skip_count,
> + s.last_slotsync_skip_at,
>
> Shall we rename last_slotsync_skip_at to slotsync_last_skip_at. That
> way all slotsync related stats columns will have same prefix.
>
> 3)
> +#include "utils/injection_point.h"
>
> + INJECTION_POINT("slot-sync-skip", NULL);
>
> I think we can move both to patch 002 as these are needed for test alone.
>
I have merged the test patch in the main patch and split the patch as
per Amit's suggestion in [1].
So this change will not be required.
> 4)
> +
> + /*
> + * If found_consistent_snapshot is not NULL, a true value means
> + * the slot synchronization was successful, while a false value
> + * means it was skipped (see
> + * update_and_persist_local_synced_slot()). If
> + * found_consistent_snapshot is NULL, no such check exists, so the
> + * stats can be updated directly.
> + */
> + if (!found_consistent_snapshot || *found_consistent_snapshot)
> + update_slotsync_skip_stats(SS_SKIP_NONE);
>
> I see that when 'found_consistent_snapshot' is true we update stats
> here but when it is false, we update stats in the caller. Also for
> 'remote_slot_precedes' case, we update stats to SS_SKIP_REMOTE_BEHIND
> here itself. I think for 'SS_SKIP_NO_CONSISTENT_SNAPSHOT' as well, we
> should update stats here instead of caller.
>
> We can do this:
> update_local_synced_slot()
> {
> skip_reason = none;
>
> if (remote is behind)
> skip_reason = SS_SKIP_REMOTE_BEHIND;
>
> if (found_consistent_snapshot && (*found_consistent_snapshot == false))
> skip_reason = SS_SKIP_NO_CONSISTENT_SNAPSHOT;
>
> --Later in this function, when syncing is done:
> update_slotsync_skip_stats(skip_reason)
> }
>
I agree with your suggestion that we should update the stats for case
of 'SS_SKIP_NO_CONSISTENT_SNAPSHOT' in update_local_synced_slot
instead of update_and_persist_local_synced_slot. I have made the
change for the same.
For your second suggestion of using the 'skip_reason' variable instead.
For the 'if (remote is behind)' we are returning inside this if
condition itself. So for this case we need to call function
update_slotsync_skip_stats directly. But for case of
'found_consistent_snapshot' we can optimise it.
> 5)
> + if (synced)
> + {
> + ReplicationSlotAcquire(NameStr(slot->data.name), true, false);
> +
> + if (slot->data.invalidated == RS_INVAL_NONE)
> + update_slotsync_skip_stats(SS_SKIP_WAL_NOT_FLUSHED);
> +
> + ReplicationSlotRelease();
> + }
>
> Shall we check 'slot->data.invalidated' along with 'synced'
> condition. That way, no need to acquire or release the slot if it is
> invalidated. We can fetch 'invalidated' under the same SpinLock
> itself.
>
Also I think to check 'slot->data.invalidated' we need to acquire the
slot due a similar race condition can occur as described in comments
below:
* The slot has been synchronized before.
*
* It is important to acquire the slot here before checking
* invalidation. If we don't acquire the slot first, there could be a
* race condition that the local slot could be invalidated just after
* checking the 'invalidated' flag here and we could end up
* overwriting 'invalidated' flag to remote_slot's value. See
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
*
I thought about it and I agree with your suggestion in [2] to add a
new reason in slotsync_skip_reason to indicate the slot skip is
happening due to an invalidated slot.
I think it is cleaner and would avoid confusion for users. I made the
changes for the same in the latest patch.
> 6)
> + SS_SKIP_WAL_NOT_FLUSHED, /* Standby did not flush the wal coresponding
> + * to confirmed flush on remote slot */
>
> on --> of
> coresponding --> corresponding
>
I have also addressed the remaining comments. I have attached the
updated v9 patches.
[1]: https://www.postgresql.org/message-id/CAA4eK1JM%2BroQMyXekvwJprMMaK_-HL%2Bn5twinZQ8fufnDEU28g%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAJpy0uC9WsJhc-qeFmz_JTPjW1vH3Zm%2BzS6jX0PKTY6vtEp38w%40mail.gmail.com
Thanks,
Shlok Kyal
| Attachment | Content-Type | Size |
|---|---|---|
| v9-0001-Add-slotsync-skip-statistics.patch | application/octet-stream | 22.7 KB |
| v9-0002-Add-slotsync_skip_reason-to-pg_replication_slots.patch | application/octet-stream | 16.2 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Shlok Kyal | 2025-11-21 10:12:48 | Re: How can end users know the cause of LR slot sync delays? |
| Previous Message | Chao Li | 2025-11-21 10:03:41 | Re: quoteOneName() inconsistency with quote_all_identifiers — replacement API proposed |