Re: Add WALRCV_CONNECTING state to walreceiver

From: Xuneng Zhou <xunengzhou(at)gmail(dot)com>
To: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add WALRCV_CONNECTING state to walreceiver
Date: 2026-01-22 10:02:54
Message-ID: CABPTF7V2GFKwzeHSFKLQpdu9bU1qBh30X1rxgygSKdVhZnwTcw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Chao,

Thanks for looking into this.

>
> Hi Xuneng,
>
> I just reviewed the patch. It splits the current STREAMING state into CONNECTING and STREAMING, where CONNECTING represents the connection establishing phase, which makes sense. The patch is solid, I have just a few small comments:
>
> 1 - 0001
> ```
> + <para>
> + <literal>stopped</literal>: WAL receiver process is not running.
> + This state is normally not visible because the view returns no row
> + when the WAL receiver is not running.
> + </para>
> ```
>
> “Is not running” appears twice in this short paragraph, looks redundant. Suggestion:
> ```
> <literal>stopped</literal>: WAL receiver process is not running.
> This state is normally not visible because the view returns no row in this case.
> ```
>
> 2 - 0002
> ```
> + WALRCV_CONNECTING, /* connecting to primary, replication not yet
> + * started */
> ```
>
> I think “primary” is inaccurate. A connection destination could be a primary, or the other standby, so maybe change “primary” to “upstream server”.

Yeah, "upstream server" is more precise. There are multiple
user-facing log messages in this file that also use “primary”. I’m
wondering whether we should update those as well.

>
>
> 3 - 0002
> ```
> * Mark walreceiver as running in shared memory.
> ```
>
> I think we should update this comment, changing “running” to “connecting”. There is not a “running” state, before this patch, “streaming” can roughly equal to “running”, after this patch, “running” is kinda unclear.
>

It has been changed.

> 4 - 0002
> ```
> + /* Connection established, switch to streaming state */
> + SpinLockAcquire(&walrcv->mutex);
> + Assert(walrcv->walRcvState == WALRCV_CONNECTING);
> + walrcv->walRcvState = WALRCV_STREAMING;
> + SpinLockRelease(&walrcv->mutex);
> ```
>
> I don’t feel good with this Assert(). Looking at ShutdownWalRcv(), the startup process might set the state to STOPPING, so there is a race.
>
> Before this patch, this is no state change here, thus rest logic can handle STOPPING. After this patch, if the race occurs, in dev mode, the Assert is fired; in production mode, STOPPING is overwritten by STREAMING, which is wrong.
>
> So, instead of Assert(), I think we should check if current state is CONNECTING, if not, it should not proceed.

It makes sense to me. Please check the updated patch.

--
Best,
Xuneng

Attachment Content-Type Size
v6-0001-Add-WALRCV_CONNECTING-state-to-walreceiver.patch application/octet-stream 6.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Xuneng Zhou 2026-01-22 10:06:21 Re: Add WALRCV_CONNECTING state to walreceiver
Previous Message Bertrand Drouvot 2026-01-22 09:52:41 Re: Flush some statistics within running transactions