Re: Fix pg_stat_wal_receiver to show CONNECTING status

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Xuneng Zhou <xunengzhou(at)gmail(dot)com>
Subject: Re: Fix pg_stat_wal_receiver to show CONNECTING status
Date: 2026-05-20 01:47:40
Message-ID: 1F153E64-B791-42FA-A60A-64813B20B81E@gmail.com
Lists: pgsql-hackers

> On May 19, 2026, at 21:55, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Tue, May 19, 2026 at 01:55:14PM +0800, Chao Li wrote:
>> I also tried restarting the standby server, and the result was the same.
>>
>> The problem is that pg_stat_wal_receiver is gated by
>> WalRcv->ready_to_display, and when the status is CONNECTING,
>> WalRcv->ready_to_display is false.
>
> Initially, I was thinking that the walrcv_connect() delay would not be
> that important to track in this context, but you are right that this
> stands for improvement before the release.
>
> @@ -1474,21 +1474,10 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS)
> - if (pid == 0 || !ready_to_display)
> + /* No WAL receiver, just return a tuple with NULL values */
> + if (pid == 0)
> PG_RETURN_NULL();
>
> This suggestion is making the SQL function call feebler, IMO,
> impacting the readability around ready_to_display that we want to act
> as a gate to the data provided in the view. This flag is important to
> check at an early state of the function call, and I don't really want
> to change that. A better thing to do would be to split into two steps
> how the WAL receiver data is filled between the walrcv_connect() call:
> 1) Before the call, reset all the connection-related fields because
> they are not relevant before the connection to the remote is
> completed, set ready_for_display to true to make the connecting state
> visible in the view. The connection information does not matter
> anyway here: we cannot be sure which point we are connected to until
> the connection is fully established.
> 2) After the call, fill in the connection-related fields.
>
> This means taking twice the WAL receiver spinlock instead of once,
> which is not going to matter in practice as the latency of the
> connection attempt is much larger than that.
>
> What do you think about the attached, then?
> --
> Michael
> <v2-0001-Improve-pg_stat_wal_receiver-for-CONNECTING-statu.patch>

Hi Michael,

Thanks for your patch.

I just read v2, and it is actually the first solution I tried. The reason I gave up on that approach and switched to the implementation in v1 is that it may wrongly report last_msg_send_time, last_msg_receipt_time, and latest_end_time. See my test with v2:

```
evantest=# SELECT * FROM pg_stat_wal_receiver;
pid | status | receive_start_lsn | receive_start_tli | written_lsn | flushed_lsn | received_tli | last_msg_send_time | last_msg_receipt_time | latest_end_lsn | latest_end_time | slot_name | sender_host | sender_port | conninfo
-------+------------+-------------------+-------------------+-------------+-------------+--------------+-------------------------------+-------------------------------+----------------+-------------------------------+-----------+-------------+-------------+----------
83930 | connecting | 0/03000000 | 1 | 0/03000000 | 0/03000000 | 1 | 2026-05-20 09:24:09.121679+08 | 2026-05-20 09:24:09.121679+08 | | 2026-05-20 09:24:09.121679+08 | | | |
(1 row)

evantest=# \c
You are now connected to database "evantest" as user "chaol".
evantest=# SELECT * FROM pg_stat_wal_receiver;
pid | status | receive_start_lsn | receive_start_tli | written_lsn | flushed_lsn | received_tli | last_msg_send_time | last_msg_receipt_time | latest_end_lsn | latest_end_time | slot_name | sender_host | sender_port | conninfo
-------+------------+-------------------+-------------------+-------------+-------------+--------------+-------------------------------+-------------------------------+----------------+-------------------------------+-----------+-------------+-------------+----------
84709 | connecting | 0/03000000 | 1 | 0/03000000 | 0/03000000 | 1 | 2026-05-20 09:27:37.407117+08 | 2026-05-20 09:27:37.407117+08 | | 2026-05-20 09:27:37.407117+08 | | | |
(1 row)

evantest=# \c
You are now connected to database "evantest" as user "chaol".
evantest=# SELECT * FROM pg_stat_wal_receiver;
pid | status | receive_start_lsn | receive_start_tli | written_lsn | flushed_lsn | received_tli | last_msg_send_time | last_msg_receipt_time | latest_end_lsn | latest_end_time | slot_name | sender_host | sender_port | conninfo
-------+------------+-------------------+-------------------+-------------+-------------+--------------+-------------------------------+-------------------------------+----------------+-------------------------------+-----------+-------------+-------------+----------
84805 | connecting | 0/03000000 | 1 | 0/03000000 | 0/03000000 | 1 | 2026-05-20 09:28:03.251298+08 | 2026-05-20 09:28:03.251298+08 | | 2026-05-20 09:28:03.251298+08 | | | |
(1 row)
```

As shown above, every time I restarted the standby server, last_msg_send_time, last_msg_receipt_time, and latest_end_time were updated to the standby server start time. But in this test, the standby was connecting to a fake primary, so no WAL receiver message had been sent or received.

I tried to avoid more complicated changes, so I ended up with the v1 approach. I think it's okay to leave the other columns NULL while the receiver is still connecting, because at that point the only reliable information available is the receiver process's PID and status.

For v1, maybe we could clarify the meaning of ready_to_display with a comment. It seems to be intended to indicate that the connection-related information, such as LSNs and timestamps, is ready to display. In that sense, pid and status don't need to be gated by it.
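
To make that reading of ready_to_display concrete, here is a minimal standalone sketch in C. It is not the actual patch and does not use the real backend structures: DemoWalRcv, DemoRow, and demo_get_wal_receiver() are hypothetical names that only mimic a few fields, assuming pid and status are always reportable while everything else is gated by the flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the shared WAL receiver state. */
typedef struct DemoWalRcv
{
	int			pid;				/* 0 means no WAL receiver is running */
	bool		ready_to_display;	/* connection-related fields are valid */
	const char *status;				/* e.g. "connecting", "streaming" */
	const char *sender_host;		/* connection-related field */
	long		last_msg_send_time; /* connection-related field */
} DemoWalRcv;

/* Hypothetical row returned to the caller; NULL columns are modeled by flags. */
typedef struct DemoRow
{
	bool		has_row;			/* false: the function returns no tuple */
	int			pid;
	const char *status;
	bool		details_null;		/* true: all gated columns are NULL */
	const char *sender_host;
	long		last_msg_send_time;
} DemoRow;

static DemoRow
demo_get_wal_receiver(const DemoWalRcv *rcv)
{
	DemoRow		row = {0};

	/* No WAL receiver at all: nothing to show. */
	if (rcv->pid == 0)
		return row;

	/* pid and status are reliable as soon as the process exists. */
	row.has_row = true;
	row.pid = rcv->pid;
	row.status = rcv->status;

	/* Everything else is gated: leave it NULL until the flag is set. */
	if (!rcv->ready_to_display)
	{
		row.details_null = true;
		return row;
	}

	row.sender_host = rcv->sender_host;
	row.last_msg_send_time = rcv->last_msg_send_time;
	return row;
}
```

With this shape, a receiver that is still connecting yields a row carrying only pid and status, so any timestamps already sitting in the shared state are never exposed before the connection is established.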

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
