Re: pg_stat_replication.*_lag sometimes shows NULL during active replication

From: Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_stat_replication.*_lag sometimes shows NULL during active replication
Date: 2026-03-16 00:25:52
Message-ID: CAOzEurSDzFsRXjofhq7mbNgoL8HaVbNeEhWBm7m9_K2ZNQnaBw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 13, 2026 at 12:27 AM Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:

> Thanks for testing and for the clarification! You're right.
>
> However, if we apply this change, the time required for the lag information to
> be reset would effectively double. I start wondering if that's really
> acceptable, especially for back branches. Although the docs doesn't clearly
> specify this timing, doubling it could affect systems that monitor
> replication lag, for example. It might still be reasonable to apply
> such a change in master, though.

Yes, I agree. Doubling the lag reset time should be avoided in back
branches if possible.

> On further thought, the root cause seems to be that walreceiver can send
> two consecutive status reply messages with identical WAL locations even
> when wal_receiver_status_interval has not yet elapsed. Addressing that
> behavior directly might resolve the issue you reported. I've attached a PoC
> patch that does this. Thought?

Thank you for the v4 patch. I think this approach is better than mine.
I tested the patch and confirmed that the issue no longer reproduces
with physical replication. However, with logical replication, the lag
columns in pg_stat_replication still show NULL periodically, once every
wal_receiver_status_interval, since send_feedback() in worker.c can
still send duplicate positions.
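
To make the discussion concrete, here is a minimal standalone sketch of
the kind of check involved: skip a status message when the positions are
identical to the last one sent and the status interval has not yet
elapsed. This is not the v4 patch and not the actual code in worker.c.
The FeedbackState struct, should_send_feedback(), and the millisecond
timestamps are hypothetical names chosen for illustration, and
XLogRecPtr is only a stand-in typedef.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's WAL position type (assumption for this sketch). */
typedef uint64_t XLogRecPtr;

/* Hypothetical record of the last feedback message that was sent. */
typedef struct FeedbackState
{
    XLogRecPtr  write_pos;
    XLogRecPtr  flush_pos;
    XLogRecPtr  apply_pos;
    int64_t     sent_at_ms;     /* time the last message was sent */
    bool        valid;          /* false until the first message is sent */
} FeedbackState;

/*
 * Decide whether a feedback message should be sent now, and remember it
 * if so.  A message is suppressed only when nothing has changed since
 * the last one AND the status interval has not elapsed AND the caller
 * did not force a reply.
 */
static bool
should_send_feedback(FeedbackState *last,
                     XLogRecPtr write_pos, XLogRecPtr flush_pos,
                     XLogRecPtr apply_pos,
                     int64_t now_ms, int64_t interval_ms, bool force)
{
    bool positions_changed = !last->valid ||
        write_pos != last->write_pos ||
        flush_pos != last->flush_pos ||
        apply_pos != last->apply_pos;
    bool interval_elapsed = !last->valid ||
        (now_ms - last->sent_at_ms) >= interval_ms;

    if (!force && !positions_changed && !interval_elapsed)
        return false;           /* duplicate within the interval: skip */

    last->write_pos = write_pos;
    last->flush_pos = flush_pos;
    last->apply_pos = apply_pos;
    last->sent_at_ms = now_ms;
    last->valid = true;
    return true;
}
```

Note that in this sketch a message with unchanged positions is still
sent once the interval elapses, so it only removes duplicates that
arrive before wal_receiver_status_interval has passed.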

+ * previsou update, i.e., when 'replyApply' is true.

One minor thing: there is a typo "previsou". It should be "previous".

--
Best regards,
Shinya Kato
NTT OSS Center
