RE: Exit walsender before confirming remote flush in logical replication

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Exit walsender before confirming remote flush in logical replication
Date: 2023-02-03 12:08:48
Message-ID: TYAPR01MB58661F81B38AC7A43F44A81DF5D79@TYAPR01MB5866.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear Amit, Sawada-san,

> > IIUC there is no difference between smart shutdown and fast shutdown
> > in logical replication walsender, but reading the doc[1], it seems to
> > me that in the smart shutdown mode, the server stops existing sessions
> > normally. For example, If the client is psql that gets stuck for some
> > reason and the network buffer gets full, the smart shutdown waits for
> > a backend process to send all results to the client. I think the
> > logical replication walsender should follow this behavior for
> > consistency. One idea is to distinguish smart shutdown and fast
> > shutdown also in logical replication walsender so that we disconnect
> > even without the done message in fast shutdown mode, but I'm not sure
> > it's worthwhile.
> >
>
> The main problem we want to solve here is to avoid shutdown failing in
> case walreceiver/applyworker is busy waiting for some lock or for some
> other reason as shown in the email [1]. I haven't tested it but if
> such a problem doesn't exist in smart shutdown mode then probably we
> can allow walsender to wait till all the data is sent.

Based on this idea, I made a PoC patch that introduces smart shutdown for walsenders.
PSA the 0002 patch. 0001 is unchanged from v5.
When a logical walsender receives a shutdown request while its send buffer is full
because of the delay, it will:

* wait until the pending data has been sent to the subscriber, if we are in smart shutdown mode
* exit immediately, if we are in fast shutdown mode

Note that in both cases, the walsender does not wait for the remote flush of WAL.
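
The behavior above can be sketched as a small decision function. This is only
an illustration and is not taken from the patch: ShutdownMode, WalSndAction and
walsender_shutdown_action() are hypothetical names, and the real walsender looks
at its output buffer rather than a boolean. It also assumes that a walsender with
nothing left to send simply exits in smart shutdown mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical shutdown modes; names are illustrative, not from the patch. */
typedef enum
{
    SHUTDOWN_NONE = 0,          /* no shutdown requested */
    SHUTDOWN_SMART,
    SHUTDOWN_FAST
} ShutdownMode;

/* What the walsender should do next (hypothetical). */
typedef enum
{
    ACTION_CONTINUE,            /* keep streaming as usual */
    ACTION_WAIT_FOR_SEND,       /* drain the send buffer, then exit */
    ACTION_EXIT                 /* exit right away */
} WalSndAction;

/*
 * Decide how a logical walsender reacts to a shutdown request.
 * output_pending is true when data is still queued for the subscriber.
 * In no case does the result depend on the remote flush position.
 */
WalSndAction
walsender_shutdown_action(ShutdownMode mode, bool output_pending)
{
    assert(mode >= SHUTDOWN_NONE && mode <= SHUTDOWN_FAST);

    if (mode == SHUTDOWN_NONE)
        return ACTION_CONTINUE;

    /* Smart shutdown: finish sending what is already queued. */
    if (mode == SHUTDOWN_SMART && output_pending)
        return ACTION_WAIT_FOR_SEND;

    /* Fast shutdown, or smart shutdown with nothing left to send. */
    return ACTION_EXIT;
}
```

For example, walsender_shutdown_action(SHUTDOWN_FAST, true) gives ACTION_EXIT even
though data is still pending, while the same call with SHUTDOWN_SMART gives
ACTION_WAIT_FOR_SEND.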

To implement this, I added a new attribute to WalSndCtlData that indicates the
shutdown status. It is normally zero, and the postmaster changes it when it
receives a shutdown request.
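
A rough sketch of such a shared status field follows. It is not the patch code:
DemoWalSndCtl stands in for WalSndCtlData, the field name shutdown_status and the
non-zero values are assumptions, and C11 atomics are used here only to keep the
example self-contained instead of PostgreSQL's own shared-memory primitives.

```c
#include <assert.h>
#include <stdatomic.h>

/* Assumed encoding: only "zero means no shutdown" comes from the description. */
enum
{
    DEMO_NO_SHUTDOWN = 0,
    DEMO_SMART_SHUTDOWN = 1,    /* assumed value */
    DEMO_FAST_SHUTDOWN = 2      /* assumed value */
};

/* Stand-in for the shared control struct; the field name is hypothetical. */
typedef struct DemoWalSndCtl
{
    atomic_int  shutdown_status;    /* 0 until a shutdown is requested */
} DemoWalSndCtl;

/* Postmaster side: record which kind of shutdown was requested. */
void
demo_request_shutdown(DemoWalSndCtl *ctl, int mode)
{
    assert(mode == DEMO_SMART_SHUTDOWN || mode == DEMO_FAST_SHUTDOWN);
    atomic_store(&ctl->shutdown_status, mode);
}

/* Walsender side: read the current shutdown status. */
int
demo_shutdown_status(DemoWalSndCtl *ctl)
{
    return atomic_load(&ctl->shutdown_status);
}
```

A walsender would read the status after being told to stop and pick its behavior
from the value it sees.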

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
v6-0001-Exit-walsender-before-confirming-remote-flush-in-.patch application/octet-stream 4.6 KB
v6-0002-Introduce-smart-shutdown-for-logical-walsender.patch application/octet-stream 4.6 KB
