| From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
|---|---|
| To: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> |
| Cc: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> |
| Subject: | Re: How can end users know the cause of LR slot sync delays? |
| Date: | 2025-11-24 05:49:23 |
| Message-ID: | CAA4eK1KcbdXufQD=p3VA=BsbF3x6UhD4o8qHCXhF4jY6A4NnDw@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Fri, Nov 21, 2025 at 6:21 PM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
>
> The Cbot complained that it was not able to build the docs. I have
> fixed it and attached the latest patch.
>
Few comments on 0001:
1.
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>slotsync_last_skip_at</structfield><type>timestamp
with time zone</type>
+ </para>
+ <para>
+ Time at which last slot synchronization was skipped.
How important it is to use last in the above field? Isn't
slotsync_skip_at sufficient, as the description says all?
2.
+ /* If slot is present on the local, update the slot sync skip stats */
+ if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
+ {
+ bool synced;
+
+ SpinLockAcquire(&slot->mutex);
+ synced = slot->data.synced;
+ SpinLockRelease(&slot->mutex);
+
+ if (synced)
+ {
+ ReplicationSlotAcquire(NameStr(slot->data.name), true, false);
+
+ pgstat_report_replslotsync_skip(slot);
+
+ ReplicationSlotRelease();
+ }
+ }
I think acquiring the slot just for reporting the stats sounds too
much. How about instead moving the check if
(remote_slot->confirmed_lsn > latestFlushPtr) in the later if/else
blocks where we have the slot acquire acquired?
In if block, we can move it just before below check
if (slot->data.persistency == RS_TEMPORARY)
{
slot_updated = update_and_persist_local_synced_slot(remote_slot,
In the else block, just before following
update_and_persist_local_synced_slot(remote_slot, remote_dbid);
3.
We can only get here if pgstat_create_replslot() or
+ * pgstat_acquire_replslot() have already been called.
+ */
+void
+pgstat_report_replslotsync_skip(ReplicationSlot *slot)
3A. Instead of having such a comment, it is better to have elog(ERROR,
.. or Assertion if the stats entry doesn't exist by this time.
3B. Also, let's name the function as pgstat_report_replslotsync().
4. I think the current test used in the patch is very complex as it
requires multiple server restarts. Instead, we can use a test similar
to what is being used in the patch proposed in the thread [1].
--
With Regards,
Amit Kapila.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | shveta malik | 2025-11-24 05:53:56 | Re: How can end users know the cause of LR slot sync delays? |
| Previous Message | Michael Paquier | 2025-11-24 05:46:29 | Re: Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache |