| From: | Shinya Kato <shinya11(dot)kato(at)gmail(dot)com> |
|---|---|
| To: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: pg_stat_replication.*_lag sometimes shows NULL during active replication |
| Date: | 2026-03-16 00:25:52 |
| Message-ID: | CAOzEurSDzFsRXjofhq7mbNgoL8HaVbNeEhWBm7m9_K2ZNQnaBw@mail.gmail.com |
| Lists: | pgsql-hackers |
On Fri, Mar 13, 2026 at 12:27 AM Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> Thanks for testing and for the clarification! You're right.
>
> However, if we apply this change, the time required for the lag information to
> be reset would effectively double. I start wondering if that's really
> acceptable, especially for back branches. Although the docs doesn't clearly
> specify this timing, doubling it could affect systems that monitor
> replication lag, for example. It might still be reasonable to apply
> such a change in master, though.

Yes, I agree. Doubling the lag reset time should be avoided in back
branches if possible.

> On further thought, the root cause seems to be that walreceiver can send
> two consecutive status reply messages with identical WAL locations even
> when wal_receiver_status_interval has not yet elapsed. Addressing that
> behavior directly might resolve the issue you reported. I've attached a PoC
> patch that does this. Thought?

Thank you for the v4 patch. I think this approach is better than mine.
I tested the patch and confirmed that the issue no longer reproduces
with physical replication. However, with logical replication, the lag
columns in pg_stat_replication still show NULL periodically, at
intervals of wal_receiver_status_interval, because send_feedback() in
worker.c can still send duplicate positions.

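To make the idea concrete, here is a minimal sketch of suppressing a
reply whose positions are identical to the previous one unless the
status interval has elapsed. This is not the v4 patch and not the actual
send_feedback() code: FeedbackState, should_send_feedback(), and the Lsn
and TimeMs typedefs are hypothetical names used only for illustration,
and the real conditions (for example, how a forced reply is handled) may
differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; not PostgreSQL's actual types or the v4 patch. */
typedef uint64_t Lsn;		/* WAL position */
typedef int64_t TimeMs;		/* timestamp in milliseconds */

typedef struct FeedbackState
{
	bool		sent_once;		/* has any reply been sent yet? */
	Lsn			last_write;		/* positions reported in the last reply */
	Lsn			last_flush;
	Lsn			last_apply;
	TimeMs		last_send_time;	/* when the last reply was sent */
} FeedbackState;

/*
 * Return true if a status reply should be sent now, and record it in *st.
 *
 * A reply whose positions all match the previous reply is suppressed
 * unless the status interval has elapsed or the caller forces a reply.
 */
static bool
should_send_feedback(FeedbackState *st, Lsn write, Lsn flush, Lsn apply,
					 TimeMs now, TimeMs interval_ms, bool force)
{
	bool		changed = !st->sent_once ||
		write != st->last_write ||
		flush != st->last_flush ||
		apply != st->last_apply;
	bool		interval_elapsed = st->sent_once &&
		(now - st->last_send_time >= interval_ms);

	if (!force && !changed && !interval_elapsed)
		return false;			/* duplicate within the interval: skip */

	st->sent_once = true;
	st->last_write = write;
	st->last_flush = flush;
	st->last_apply = apply;
	st->last_send_time = now;
	return true;
}
```

In this sketch, two back-to-back calls with the same positions produce
only one reply, and a reply with unchanged positions goes out again only
once interval_ms has passed since the previous one.
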
+ * previsou update, i.e., when 'replyApply' is true.
One minor thing: there is a typo "previsou". It should be "previous".
--
Best regards,
Shinya Kato
NTT OSS Center