| From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> |
| Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
| Subject: | Re: Fix pg_stat_wal_receiver to show CONNECTING status |
| Date: | 2026-05-22 02:06:33 |
| Message-ID: | 93526C6D-DE0A-4B7D-B908-366735FC211D@gmail.com |
| Lists: | pgsql-hackers |
> On May 21, 2026, at 20:29, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>
>
>
>> On May 21, 2026, at 20:08, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>>
>> On Thu, May 21, 2026 at 03:20:13PM +0800, Chao Li wrote:
>>> I spent more time here, and found that it is still possible to leak
>>> conninfo in the WAL receiver reuse path:
>>>
>>> * WalRcvWaitForStartPosition() sets the state to WALRCV_WAITING.
>>> * Then RequestXLogStreaming() copies raw conninfo into
>>>   walrcv->conninfo and sets the state to WALRCV_RESTARTING.
>>> * WalRcvWaitForStartPosition() then moves the state to
>>>   WALRCV_CONNECTING, but this path does not clear walrcv->conninfo
>>>   again.
>>>
>>> The attached nocfbot_test.diff demonstrates the leak.
>>
>> File is missing, but I get it. This is a legit bug from what I can
>> see, that also affects all the stable branches, not only HEAD.
>>
>>> Initially I thought we could also set ready_to_display to false when
>>> setting the state to WALRCV_WAITING in WalRcvWaitForStartPosition(),
>>> and set it back to true when switching back to
>>> WALRCV_CONNECTING. However, that would make the WALRCV_WAITING and
>>> WALRCV_RESTARTING states invisible in pg_stat_wal_receiver.
>>
>> Nah, we should not do that. We want to track the waiting and
>> restarting states in the view.
>>
>>> I ended up with a solution that copies the primary connection info
>>> to walrcv->conninfo only when RequestXLogStreaming() is switching to
>>> WALRCV_STARTING. In the WALRCV_WAITING reuse path, the WAL receiver
>>> keeps using the existing wrconn, so it does not need raw conninfo to
>>> be copied into shared memory again. See the attached
>>> nocfbot_walreceiverfuncs.c.diff.
>>
>> Ah, yeah. This solution to this problem makes sense. We should not
>> clobber conninfo either in this case, or we'd lose the
>> user-displayable string returned by walrcv_get_conninfo() (conninfo
>> cannot be NULL based on the in-core callers of RequestXLogStreaming()
>> AFAIK, but who knows for things out there). As mentioned above, this
>> is a different issue than the visibility of the connection information
>> while we are connecting, and it should be backpatched. Would you like
>> to send a patch?
>> --
>> Michael
>
> Sorry for missing the attachments. Please take a look first. It’s late here, I can spend more time tomorrow.
>
> Best regards,
> --
> Chao Li (Evan)
> HighGo Software Co., Ltd.
> https://www.highgo.com/
>
>
>
>
> <nocfbot_test.diff><nocfbot_walreceiverfuncs.c.diff>
Here is the patch set:
* v3-0001 is exactly the same as v2-0001.
* In v3-0002, the change in walreceiverfuncs.c is the same as in the previous diff, and I adjusted the test change slightly.
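
For readers who do not have the patch open, the idea behind the walreceiverfuncs.c change can be sketched as a small standalone C model. This is not the patch itself: ModelWalRcv, model_request_streaming() and the copy_only_on_start flag are hypothetical names made up for illustration, the state enum is a trimmed-down stand-in for the real one, and the "sanitized" string is only an assumed example of what walrcv_get_conninfo() might leave in shared memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAXCONNINFO 1024

/* Simplified stand-in for the WAL receiver states discussed in this thread. */
typedef enum
{
	WALRCV_STOPPED,
	WALRCV_STARTING,
	WALRCV_CONNECTING,
	WALRCV_WAITING,
	WALRCV_RESTARTING
} WalRcvState;

/* Hypothetical, trimmed-down model of the shared WAL receiver struct. */
typedef struct
{
	WalRcvState state;
	char		conninfo[MAXCONNINFO];
} ModelWalRcv;

/*
 * Model of the request path (illustrative only).
 *
 * copy_only_on_start = false: always copy the raw conninfo, which is the
 * behavior that leaks in the WALRCV_WAITING reuse path.
 * copy_only_on_start = true: copy only when a fresh receiver is launched
 * (STOPPED -> STARTING), leaving the existing string alone on reuse.
 */
static void
model_request_streaming(ModelWalRcv *walrcv, const char *conninfo,
						bool copy_only_on_start)
{
	bool		launch = (walrcv->state == WALRCV_STOPPED);

	/* The request is only expected from these two states. */
	assert(walrcv->state == WALRCV_STOPPED ||
		   walrcv->state == WALRCV_WAITING);

	if (launch || !copy_only_on_start)
	{
		if (conninfo != NULL)
			snprintf(walrcv->conninfo, MAXCONNINFO, "%s", conninfo);
		else
			walrcv->conninfo[0] = '\0';
	}

	walrcv->state = launch ? WALRCV_STARTING : WALRCV_RESTARTING;
}
```

In this model, a receiver sitting in WALRCV_WAITING with an already sanitized string keeps that string when copy_only_on_start is true, while the unconditional copy overwrites it with the raw value, password included. Skipping the copy rather than clearing the field is deliberate here: it mirrors the point made upthread that the user-displayable string should not be clobbered in the reuse path.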
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
| Attachment | Content-Type | Size |
|---|---|---|
| v3-0001-Improve-pg_stat_wal_receiver-for-CONNECTING-statu.patch | application/octet-stream | 3.4 KB |
| v3-0002-Avoid-exposing-raw-WAL-receiver-conninfo-during-t.patch | application/octet-stream | 3.9 KB |