Re: Exit walsender before confirming remote flush in logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Exit walsender before confirming remote flush in logical replication
Date: 2023-02-02 05:51:54
Message-ID: CAA4eK1+pJh16OagyJ1KAuhB-5RExqF1yQw3svRW8oeg-aKCA-A@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 2, 2023 at 10:48 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Wed, Feb 1, 2023 at 6:28 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> >
> > > In a case where pq_is_send_pending() doesn't become false
> > > for a long time, (e.g., the network socket buffer got full due to the
> > > apply worker waiting on a lock), I think users should unblock it by
> > > themselves. Or it might be practically better to shut down the
> > > subscriber first in the logical replication case, unlike the physical
> > > replication case.
> > >
> >
> > Yeah, will users like such a dependency? And what will they gain by doing so?
>
> IIUC there is no difference between smart shutdown and fast shutdown
> in logical replication walsender, but reading the doc[1], it seems to
> me that in the smart shutdown mode, the server stops existing sessions
> normally. For example, if the client is psql that gets stuck for some
> reason and the network buffer gets full, the smart shutdown waits for
> a backend process to send all results to the client. I think the
> logical replication walsender should follow this behavior for
> consistency. One idea is to distinguish smart shutdown and fast
> shutdown also in logical replication walsender so that we disconnect
> even without the done message in fast shutdown mode, but I'm not sure
> it's worthwhile.
>

The main problem we want to solve here is to avoid shutdown failing in
case the walreceiver/apply worker is busy waiting for some lock or for
some other reason, as shown in the email [1]. I haven't tested it, but
if such a problem doesn't exist in smart shutdown mode, then we can
probably allow the walsender to wait till all the data is sent. We can
investigate what it would take to make the logical walsender aware of
the shutdown mode. OTOH, the docs for smart shutdown say "If the
server is in hot standby, recovery and streaming replication will be
terminated once all clients have disconnected.", which to me indicates
that it is okay to terminate logical replication connections even in
smart mode.
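
To make the idea above concrete, here is a rough sketch of how the exit
condition could depend on the shutdown mode. This is not PostgreSQL
code: the enum and the function name are made up for illustration, and
the send_pending flag merely stands in for what pq_is_send_pending()
would report. It only models the decision, assuming that fast shutdown
exits immediately and smart shutdown waits for the output buffer to
drain.

```c
#include <stdbool.h>

/* Hypothetical shutdown modes; illustrative names, not PostgreSQL's. */
typedef enum
{
    SKETCH_SHUTDOWN_SMART,
    SKETCH_SHUTDOWN_FAST
} SketchShutdownMode;

/*
 * Decide whether a logical walsender may exit now.
 *
 * send_pending is an assumption: it stands in for the result of
 * pq_is_send_pending(), i.e. whether unsent data remains buffered.
 */
bool
sketch_walsender_may_exit(SketchShutdownMode mode, bool send_pending)
{
    /* Fast shutdown: disconnect even if data is still unsent. */
    if (mode == SKETCH_SHUTDOWN_FAST)
        return true;

    /* Smart shutdown: wait until all pending data has been sent. */
    return !send_pending;
}
```

With this split, a subscriber stuck on a lock would block only a smart
shutdown, while a fast shutdown would still complete.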

[1] - https://www.postgresql.org/message-id/TYAPR01MB58669CB06F6657ABCEFE6555F5F29%40TYAPR01MB5866.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.
