Re: pg_stat_replication.*_lag sometimes shows NULL during active replication

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_stat_replication.*_lag sometimes shows NULL during active replication
Date: 2026-03-10 01:54:13
Message-ID: CAHGQGwEmMBBAE0RG-R3_LacfT4fbB55qGE6n9O5mNwrqvbNBtw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 10, 2026 at 10:02 AM Shinya Kato <shinya11(dot)kato(at)gmail(dot)com> wrote:
>
> On Mon, Mar 9, 2026 at 8:21 PM Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> > > The attached v2 patch takes a different approach: it additionally
> > > requires that all reported positions (write/flush/apply) remain
> > > unchanged from the previous reply. This directly detects a truly idle
> > > system without relying on timeouts—if any position has advanced, new
> > > WAL activity must have occurred, so we should not clear the lag values
> > > even if the lag tracker is empty.
> >
> > This approach looks good to me.
>
> Thank you for looking into this.
>
> > One comment: currently, the lag becomes NULL basically after about one
> > wal_receiver_status_interval during periods of no activity. OTOH, with this
> > approach, it seems it would take about twice wal_receiver_status_interval.
> > Is this understanding correct?
>
> Exactly. With this patch, it takes about two
> wal_receiver_status_interval cycles to show NULL instead of one. I
> think this is an acceptable trade-off because it is better to take a
> bit longer to detect inactivity than to incorrectly show NULL during
> active replication.

Even with your latest patch, if we remove fullyAppliedLastTime, and set
clearLagTimes to true when applyPtr == sentPtr && noLagSamples &&
positionsUnchanged,
wouldn't the time for the lag to become NULL be almost the same as
wal_receiver_status_interval?

The documentation doesn't clearly specify how long it should take for
the lag to become NULL, so doubling that time might be acceptable.
However, if we can keep it roughly the same without much complexity,
I think that would be preferable.

Thought?

--
Fujii Masao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Manni Wood 2026-03-10 02:30:21 Re: Speed up COPY FROM text/CSV parsing using SIMD
Previous Message Chao Li 2026-03-10 01:42:17 Re: client_connection_check_interval default value