Re: logical apply worker's lock waits in subscriber can stall checkpointer in publisher

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: logical apply worker's lock waits in subscriber can stall checkpointer in publisher
Date: 2026-02-02 12:15:27
Message-ID: CAHGQGwE3uQveGEJN+1S9r_OzXhRdMpXD7y=-D516YtxWQeNp4A@mail.gmail.com
Lists: pgsql-hackers

On Mon, Feb 2, 2026 at 1:50 PM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear Fujii-san,
>
> > Yeah, but I'd like to try the first option. Attached is a very WIP patch that
> > attempts to implement it.
> >
> > With this patch, when a walsender exits with >= FATAL,
> > send_message_to_frontend() attempts to send the error message to the standby
> > in non-blocking mode. If that fails, the walsender gives up on sending
> > the message and exits immediately.
>
> I'm still not sure whether it is OK to modify such fundamental code, but I
> confirmed that your patch solves the issue.
>
> One concern for me is that WAL might be more likely to be missed in the
> streaming replication case. What if the walreceiver is a bit busy, so the send
> buffer stays full for a while?
> Is this not an issue because switchover after the walsender exits with FATAL is
> not recommended?

I don't think this is a problem, since PostgreSQL has never guaranteed that
WAL data already in the send buffer will actually be delivered to the client
when a walsender exits with FATAL. Do you see this differently?
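
For illustration only, the "try once in non-blocking mode, then give up" behavior
discussed above can be sketched outside PostgreSQL with plain sockets. This is
not the patch and not PostgreSQL code: try_send_final_message() and
fill_send_buffer() are hypothetical helper names, and a local socket pair stands
in for the walsender/standby connection.

```python
import socket


def try_send_final_message(sock: socket.socket, payload: bytes) -> bool:
    """Attempt a single non-blocking send of a final error message.

    Returns True only if the whole payload was handed to the kernel.
    If the send buffer is full (or the socket is broken), give up
    immediately instead of waiting for the peer to drain it.
    """
    sock.setblocking(False)
    try:
        sent = sock.send(payload)
    except BlockingIOError:
        # Send buffer is full: the peer is not reading. Do not wait.
        return False
    except OSError:
        # Connection reset, broken pipe, etc. Nothing more to do.
        return False
    return sent == len(payload)


def fill_send_buffer(sock: socket.socket) -> int:
    """Simulate a stalled peer by filling the send buffer completely.

    Hypothetical test helper: writes until even a 1-byte send would block,
    and returns the number of bytes queued.
    """
    sock.setblocking(False)
    total = 0
    for size in (65536, 1):
        chunk = b"x" * size
        try:
            while True:
                total += sock.send(chunk)
        except BlockingIOError:
            pass  # No room left for this chunk size; try the smaller one.
    return total
```

With an idle peer the message is queued and the helper returns True. Once the
peer stops reading and the buffer is full, the helper returns False right away,
which corresponds to the sender exiting without waiting on a stalled receiver.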

Regards,

--
Fujii Masao
