Re: Fix lag columns in pg_stat_replication not advancing when replay LSN stalls

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Fix lag columns in pg_stat_replication not advancing when replay LSN stalls
Date: 2025-10-17 14:28:17
Message-ID: CAHGQGwEyzPoB+9eRRABp7oKFX12ACdU0oifz7oexmuJpQuMxTQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 17, 2025 at 5:11 PM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
> It took me some time to understand this fix. What confused me most was how a reader head catches up again once an overwrite happens. Finally I figured it out:
>
> ```
> + lag_tracker->read_heads[head] =
> + (lag_tracker->write_head + 1) % LAG_TRACKER_BUFFER_SIZE;
> ```
>
> "(lag_tracker->write_head + 1) % LAG_TRACKER_BUFFER_SIZE" points to the oldest LSN in the ring, from where an overflowed reader head starts to catch up.
>
> I have no comment on the code change. Nice patch!

Thanks for the review!

I've updated the source comment to make the code easier to understand.
The updated patch is attached.
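
For anyone else reading the thread, here is a minimal standalone sketch of the
read-head reset discussed above. It is not the patch itself: only write_head,
the read head idea, and the "(write_head + 1) % size" expression come from the
quoted hunk. The SketchRing type, the sketch_* functions, the
SKETCH_OVERFLOWED marker, the single reader, and the tiny buffer size are
assumptions made purely for illustration.

```c
#include <stdint.h>

#define SKETCH_BUFFER_SIZE 4    /* assumed stand-in for LAG_TRACKER_BUFFER_SIZE */
#define SKETCH_OVERFLOWED  (-1) /* hypothetical marker for an overrun read head */

typedef struct SketchRing
{
    uint64_t lsns[SKETCH_BUFFER_SIZE];
    int      write_head;  /* next slot to write; treated as empty */
    int      read_head;   /* next slot to read, or SKETCH_OVERFLOWED */
} SketchRing;

static void
sketch_init(SketchRing *r)
{
    for (int i = 0; i < SKETCH_BUFFER_SIZE; i++)
        r->lsns[i] = 0;
    r->write_head = 0;
    r->read_head = 0;
}

/*
 * Record one sample.  If advancing the write head would run into the
 * reader, mark the reader as overflowed instead of stalling the writer.
 */
static void
sketch_write(SketchRing *r, uint64_t lsn)
{
    int new_write_head = (r->write_head + 1) % SKETCH_BUFFER_SIZE;

    if (new_write_head == r->read_head)
        r->read_head = SKETCH_OVERFLOWED;

    r->lsns[r->write_head] = lsn;
    r->write_head = new_write_head;
}

/*
 * Fetch the next unread sample.  Returns 1 and stores it in *lsn, or
 * returns 0 when the reader has caught up with the writer.
 */
static int
sketch_read(SketchRing *r, uint64_t *lsn)
{
    /*
     * An overflowed reader restarts at the oldest sample still held.  The
     * slot at write_head is about to be overwritten, so the oldest usable
     * slot is the one right after it.
     */
    if (r->read_head == SKETCH_OVERFLOWED)
        r->read_head = (r->write_head + 1) % SKETCH_BUFFER_SIZE;

    if (r->read_head == r->write_head)
        return 0;

    *lsn = r->lsns[r->read_head];
    r->read_head = (r->read_head + 1) % SKETCH_BUFFER_SIZE;
    return 1;
}
```

In this sketch, writing five samples into the four-slot ring without reading
any of them leaves the reader marked as overflowed. The next sketch_read()
call repositions it just past write_head, and it then drains the three
remaining samples in order before reporting that it has caught up.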

> All I wonder is if we can add a TAP test for this fix?

I think it would be good to add a test for this fix, but reproducing
the condition where the buffer fills up and the slowest read entry
overflows takes time. Because of that, I'm not sure that adding such a
potentially slow test is a good idea.

Regards,

--
Fujii Masao

Attachment: v2-0001-Fix-stalled-lag-columns-in-pg_stat_replication-wh.patch (application/octet-stream, 4.2 KB)
