Re: pg_stat_replication.*_lag sometimes shows NULL during active replication

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Shinya Kato <shinya11(dot)kato(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_stat_replication.*_lag sometimes shows NULL during active replication
Date: 2026-03-19 17:13:25
Message-ID: CAHGQGwGLUXmjC1+A1fzg-ynP1pdKC-0yfmLYcnnu4YJSEDnuQw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Mar 19, 2026 at 10:58 PM Shinya Kato <shinya11(dot)kato(at)gmail(dot)com> wrote:
>
> On Tue, Mar 17, 2026 at 11:00 AM Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> >
> > On Mon, Mar 16, 2026 at 9:26 AM Shinya Kato <shinya11(dot)kato(at)gmail(dot)com> wrote:
> > > Thank you for the v4 patch. I think this approach is better than mine.
> > > I tested the patch and confirmed that the issue no longer reproduces
> > > with physical replication. However, with logical replication, the lag
> > > columns in pg_stat_replication still show NULL periodically at
> > > wal_receiver_status_interval, since send_feedback() in worker.c can
> > > still send duplicate positions.
> >
> > I was thinking that if a feedback message triggered by
> > wal_receiver_status_interval has the same LSNs as the previous message,
> > it's expected for the lag columns to become NULL. But you see it differently,
> > don't you? Sorry, I failed to understand your point...
>
> Sorry for the confusion. I ran a script inserting one row every 0.5
> seconds under logical replication and confirmed that NULL still
> appears in the lag columns even while replication is actively running.
> I was initially mistaken that this was tied to
> wal_receiver_status_interval timing — that turned out to be unrelated.
>
> I haven't had time to investigate further, but my current impression
> is that the existing approach may not be sufficient for logical
> replication.

Thanks for the clarification! I understand your point now.

I think the issue occurs when all the positions in the first message point to
the same LSN (e.g., 0/030D5230), and the positions in the second message again
all match each other, but at a larger LSN (e.g., 0/030D52E0).

I've updated the patch to address this. It removes fullyAppliedLastTime,
tracks the positions from the previous reply, and clears the lag values only
when the positions remain unchanged across two consecutive messages.
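
To illustrate the rule described above, here is a minimal standalone sketch.
It is not taken from the v5 patch: ReplyTracker and reply_should_clear_lag()
are hypothetical names, XLogRecPtr is a plain uint64_t stand-in, and any
other conditions the real walsender code may check are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's LSN type (assumption for this sketch). */
typedef uint64_t XLogRecPtr;

/* Hypothetical state: the positions reported by the previous reply. */
typedef struct ReplyTracker
{
    bool        have_prev;      /* false until the first reply is seen */
    XLogRecPtr  prev_write;
    XLogRecPtr  prev_flush;
    XLogRecPtr  prev_apply;
} ReplyTracker;

/*
 * Remember the positions from a new reply and return true only when all
 * three are identical to those in the previous reply, i.e. the positions
 * stayed unchanged across two consecutive messages.
 */
bool
reply_should_clear_lag(ReplyTracker *t, XLogRecPtr write_lsn,
                       XLogRecPtr flush_lsn, XLogRecPtr apply_lsn)
{
    bool        unchanged;

    unchanged = t->have_prev &&
        t->prev_write == write_lsn &&
        t->prev_flush == flush_lsn &&
        t->prev_apply == apply_lsn;

    /* Save the current positions for comparison with the next reply. */
    t->have_prev = true;
    t->prev_write = write_lsn;
    t->prev_flush = flush_lsn;
    t->prev_apply = apply_lsn;

    return unchanged;
}
```

With the example LSNs above, a reply with all positions at 0/030D5230
followed by one with all positions at 0/030D52E0 would not clear the lag
values in this sketch. Only a further reply that repeats 0/030D52E0 would.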

Patch attached. Could you test and review this updated patch?

Regards,

--
Fujii Masao

Attachment Content-Type Size
v5-0001-Avoid-sending-duplicate-WAL-locations-in-standby-.patch application/octet-stream 10.8 KB
