| From: | Shinya Kato <shinya11(dot)kato(at)gmail(dot)com> |
|---|---|
| To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | pg_stat_replication.*_lag sometimes shows NULL during active replication |
| Date: | 2026-02-24 06:53:54 |
| Message-ID: | CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi hackers,
I have noticed that the pg_stat_replication.*_lag columns sometimes show
NULL while replication is active, for example when a health check
inserts one record per second. This happens when the startup process
replays the WAL before the walreceiver sends its flush notification to
the walsender.
Here is the sequence that triggers the issue (see normal.svg and
error.svg for diagrams of the normal and problematic cases):
1. The walreceiver receives, writes, and flushes WAL, then wakes the
startup process via WakeupRecovery().
2. The startup process replays all available WAL quickly, then calls
WalRcvForceReply() to set force_reply = true and wakes the
walreceiver.
3. The walreceiver sends a flush notification to the walsender
(XLogWalRcvSendReply() in XLogWalRcvFlush()). Since the startup process
has already replayed the WAL by this point, this message reports the
incremented applyPtr, which equals sentPtr. The walsender processes
this message, consuming the LagTracker samples and setting
fullyAppliedLastTime = true.
4. In the next loop iteration, the walreceiver sees force_reply = true
and sends another reply with the same positions. The walsender sees
applyPtr == sentPtr for the second consecutive time and sets
clearLagTimes = true. Since the LagTracker samples were already
consumed by step 3, all lag values are -1. With clearLagTimes = true,
these -1 values are written to walsnd->*Lag, causing
pg_stat_replication to show NULL.
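The walsender side of steps 3 and 4 can be sketched as a small model. This is only an illustration under simplifying assumptions: SenderModel, model_lag_read() and model_process_reply() are hypothetical names, a single lag value stands in for the write/flush/apply lags, and the real logic lives in ProcessStandbyReplyMessage().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the walsender state; not the real structs. */
typedef struct
{
    bool    sample_pending;     /* an unread LagTracker sample exists */
    bool    fully_applied_last; /* models fullyAppliedLastTime */
    int64_t reported_lag;       /* models walsnd->*Lag; -1 is shown as NULL */
} SenderModel;

/* Reading the tracker consumes the sample; -1 means "no new sample". */
static int64_t
model_lag_read(SenderModel *s, int64_t sample_lag)
{
    if (!s->sample_pending)
        return -1;
    s->sample_pending = false;
    return sample_lag;
}

/* Process one standby reply, following the sequence described above. */
static void
model_process_reply(SenderModel *s, uint64_t apply_ptr, uint64_t sent_ptr,
                    int64_t sample_lag)
{
    int64_t lag = model_lag_read(s, sample_lag);
    bool    clear_lag_times = false;

    if (apply_ptr == sent_ptr)
    {
        /* Second consecutive "fully applied" reply: assumed to be idle. */
        if (s->fully_applied_last)
            clear_lag_times = true;
        s->fully_applied_last = true;
    }
    else
        s->fully_applied_last = false;

    /* A -1 lag overwrites the stored value only when clearing is requested. */
    if (lag != -1 || clear_lag_times)
        s->reported_lag = lag;
}
```

Feeding this model two back-to-back replies with apply_ptr == sent_ptr reproduces the symptom: the first reply stores the measured lag, and the second one, which finds no sample left, overwrites it with -1.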
The comment in ProcessStandbyReplyMessage() says:
* If the standby reports that it has fully replayed the WAL in two
* consecutive reply messages, then the second such message must result
* from wal_receiver_status_interval expiring on the standby.
But as shown above, the second message can also come from
WalRcvForceReply(), violating this assumption.
The attached patch fixes this by adding a check that all lag values
are -1 to the clearLagTimes condition. This ensures that clearLagTimes
only triggers when there are truly no new lag samples in two
consecutive messages (i.e., the system is genuinely idle), and not
when the samples were simply consumed by a preceding message in a
burst of replies.
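For illustration, one way such a condition could look is sketched below. This is an assumption about the shape of the check, not a copy of the attached patch: LagUsec, all_lags_missing() and should_clear_lag_times() are hypothetical names, and the patch itself may be structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int64_t LagUsec;    /* hypothetical alias; -1 means "no sample" */

/* True when none of the three lag readings produced a new sample. */
static bool
all_lags_missing(LagUsec write_lag, LagUsec flush_lag, LagUsec apply_lag)
{
    return write_lag == -1 && flush_lag == -1 && apply_lag == -1;
}

/*
 * Decide whether to clear the lag times for one reply message.
 * caught_up models applyPtr == sentPtr; *fully_applied_last models
 * fullyAppliedLastTime and is updated in place.
 */
static bool
should_clear_lag_times(bool *fully_applied_last, bool caught_up,
                       LagUsec write_lag, LagUsec flush_lag,
                       LagUsec apply_lag)
{
    bool clear = false;

    /* Count a reply as "idle" only if it also carried no new samples. */
    if (caught_up && all_lags_missing(write_lag, flush_lag, apply_lag))
    {
        if (*fully_applied_last)
            clear = true;
        *fully_applied_last = true;
    }
    else
        *fully_applied_last = false;

    return clear;
}
```

Under this sketch, the reply from step 3 carries real samples and therefore does not count as idle, the forced reply from step 4 is only the first idle reply, and clearing happens once a further reply arrives with no samples.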
Regards,
--
Best regards,
Shinya Kato
NTT OSS Center
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Fix-pg_stat_replication.-_lag-showing-NULL-during.patch | application/octet-stream | 3.3 KB |
| normal.svg | image/svg+xml | 112.8 KB |
| error.svg | image/svg+xml | 112.8 KB |