Re: Exit walsender before confirming remote flush in logical replication

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Andrey Silitskiy <a(dot)silitskiy(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Ronan Dunklau <ronan(at)dunklau(dot)fr>, Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "sawada(dot)mshk(at)gmail(dot)com" <sawada(dot)mshk(at)gmail(dot)com>, "michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "peter(dot)eisentraut(at)enterprisedb(dot)com" <peter(dot)eisentraut(at)enterprisedb(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>
Subject: Re: Exit walsender before confirming remote flush in logical replication
Date: 2026-04-23 04:51:53
Message-ID: CAHGQGwHwTM5bfM9H35zdxzG+800LR=o4mTvZi08vJ843+29GDg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 22, 2026 at 3:32 AM Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> Therefore, since replacing pq_flush() with pq_flush_if_writable() seems to
> change behavior only in a limited and acceptable way, I'm thinking to create
> the patch doing that replacement.

On second thought, replacing pq_flush() with pq_flush_if_writable() is not
sufficient. EndCommand(), which WalSndDone() calls before pq_flush(), can also
block when the send buffer is full. That happens because EndCommand() uses
pq_putmessage() rather than pq_putmessage_noblock().
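
To illustrate the difference between the two queuing styles, here is a minimal
toy model in C. It is not PostgreSQL code: ToySendBuf and the toy_* functions
are hypothetical names invented for this sketch. It assumes, as I understand
the libpq backend code, that the blocking variant has to flush to the socket
when the message does not fit, while the non-blocking variant enlarges the
buffer so that queuing never waits on the peer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a connection's send buffer; not PostgreSQL's real struct. */
typedef struct ToySendBuf
{
    char   *data;
    size_t  len;    /* bytes queued but not yet sent */
    size_t  cap;    /* allocated size */
} ToySendBuf;

void
toy_init(ToySendBuf *buf, size_t cap)
{
    assert(cap > 0);
    buf->data = malloc(cap);
    assert(buf->data != NULL);
    buf->len = 0;
    buf->cap = cap;
}

void
toy_free(ToySendBuf *buf)
{
    free(buf->data);
    buf->data = NULL;
    buf->len = buf->cap = 0;
}

/*
 * Blocking-style put: if the message does not fit, a real implementation
 * would first flush to the socket, which can block on a stalled peer.
 * The toy returns false at exactly that point instead of blocking.
 */
bool
toy_put_blocking(ToySendBuf *buf, const char *msg, size_t n)
{
    if (n > buf->cap - buf->len)
        return false;       /* would need to flush, and may block */
    memcpy(buf->data + buf->len, msg, n);
    buf->len += n;
    return true;
}

/* Non-blocking-style put: grow the buffer so queuing never waits. */
void
toy_put_noblock(ToySendBuf *buf, const char *msg, size_t n)
{
    if (n > buf->cap - buf->len)
    {
        size_t  newcap = buf->len + n;
        char   *p = realloc(buf->data, newcap);

        assert(p != NULL);
        buf->data = p;
        buf->cap = newcap;
    }
    memcpy(buf->data + buf->len, msg, n);
    buf->len += n;
}
```

With a full buffer, toy_put_blocking() reports that it would have to flush,
whereas toy_put_noblock() always queues the message and leaves the flush to
the caller.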

Also, replacing pq_flush() with pq_flush_if_writable() would cause walsender to
give up sending pending messages, including CommandComplete, even before
wal_sender_shutdown_timeout expires. That seems a bit odd. I think it is better
for walsender to continue honoring wal_sender_shutdown_timeout while attempting
to send the final CommandComplete message.

I've attached a patch that addresses both issues. For the first, it introduces
EndCommandExtended(), which allows CommandComplete to be queued with
pq_putmessage_noblock(). For the second, it updates WalSndDone() to use
ProcessPendingWrites() instead of pq_flush(), so the walsender write loop can
continue processing replies and checking replication and shutdown timeouts
while pending output is being flushed.

Thoughts?

Regards,

--
Fujii Masao

Attachment: v1-0001-Avoid-blocking-indefinitely-while-finishing-walse.patch (application/octet-stream, 5.5 KB)
