RE: Exit walsender before confirming remote flush in logical replication

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Andres Freund' <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Exit walsender before confirming remote flush in logical replication
Date: 2023-02-07 14:41:13
Message-ID: TYAPR01MB58661D0EDEEDDA284226D401F5DB9@TYAPR01MB5866.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear Andres, Amit,

> On 2023-02-07 09:00:13 +0530, Amit Kapila wrote:
> > On Tue, Feb 7, 2023 at 2:04 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > How about we make it an option in START_REPLICATION? Delayed logical
> > > rep can toggle that on by default.
>
> > Works for me. So, when this option is set in START_REPLICATION
> > message, walsender will set some flag and allow itself to exit at
> > shutdown without waiting for WAL to be sent?
>
> Yes. I think that might be useful in other situations as well, but we don't
> need to make those configurable initially. But I imagine it'd be useful to set
> things up so that non-HA physical replicas don't delay shutdown, particularly
> if they're geographically far away.

Based on the discussion, I made a patch that adds a walsender option,
exit_before_confirming, to the START_REPLICATION replication command. It can be
used for both physical and logical replication. I wrote the patch with
extensibility in mind - it allows further options to be added later.
Suggestions for a better name are very welcome.

For physical replication, the grammar was changed slightly to resemble the logical
one. It can now accept an option list, but currently only this one option is
allowed, and normal streaming replication does not use it. For logical replication,
the option is passed together with the options for the output plugin. Of course, we
can change the API to a better style.
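
As a rough illustration only (this is not code from the patch), the two command
forms could be assembled as below. The helper name build_start_replication, the
slot names, and the "name 'value'" spelling of the option list are assumptions
made for the example; the syntax the patch actually accepts may differ.

```python
from typing import Mapping, Optional


def build_start_replication(slot: str, lsn: str, logical: bool,
                            options: Optional[Mapping[str, str]] = None) -> str:
    """Build a START_REPLICATION command string (hypothetical helper).

    The option list is rendered the way logical replication already renders
    output plugin options: (name 'value', ...). Whether the patch spells
    exit_before_confirming exactly like this is an assumption.
    """
    parts = ["START_REPLICATION", "SLOT", slot,
             "LOGICAL" if logical else "PHYSICAL", lsn]
    if options:
        rendered = ", ".join(
            # Double any single quote so the value stays a valid literal.
            "{} '{}'".format(name, value.replace("'", "''"))
            for name, value in options.items()
        )
        parts.append("(" + rendered + ")")
    return " ".join(parts)


# Physical replication: the new option is the only one in the list.
physical_cmd = build_start_replication(
    "standby1", "0/3000000", logical=False,
    options={"exit_before_confirming": "true"},
)

# Logical replication: the option travels with the output plugin options.
logical_cmd = build_start_replication(
    "sub1", "0/0", logical=True,
    options={
        "proto_version": "4",
        "publication_names": '"pub1"',
        "exit_before_confirming": "true",
    },
)
```

Here physical_cmd and logical_cmd only show where the option would sit in each
command; they are not meant to be sent to a server as-is.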

The 0001 patch was ported from the time-delayed logical replication thread[1] and uses
the added option. When the min_apply_delay option is specified and the publisher appears
to be PG16 or later, the apply worker sends a START_REPLICATION query with
exit_before_confirming = true. The worker restarts and sends START_REPLICATION
again when min_apply_delay is changed from zero to a non-zero value or from non-zero to zero.
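
The two rules above can be sketched as follows. This is a simplified model, not
the apply worker code itself: the function names are hypothetical, the delay is
assumed to be in milliseconds, and the publisher version is assumed to be
compared in server_version_num form (160000 for PG16).

```python
PG16_VERSION_NUM = 160000  # server_version_num of PostgreSQL 16.0


def wants_exit_before_confirming(min_apply_delay_ms: int,
                                 publisher_version_num: int) -> bool:
    """True if START_REPLICATION should carry exit_before_confirming = true.

    The option is requested only when a delay is configured and the
    publisher is new enough to understand the option.
    """
    return min_apply_delay_ms > 0 and publisher_version_num >= PG16_VERSION_NUM


def needs_worker_restart(old_delay_ms: int, new_delay_ms: int) -> bool:
    """True if a min_apply_delay change requires re-sending START_REPLICATION.

    Only a switch between zero and non-zero matters; changing one non-zero
    delay to another leaves the option unchanged.
    """
    return (old_delay_ms == 0) != (new_delay_ms == 0)
```

For example, changing the delay from 0 to 5000 requires a restart, while
changing it from 5000 to 10000 does not, because the option stays enabled in
both cases.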

Note that I removed the version number from the patches because the approach has completely changed.

[1]: https://www.postgresql.org/message-id/TYCPR01MB8373BA483A6D2C924C600968EDDB9@TYCPR01MB8373.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
0001-Time-delayed-logical-replication-subscriber.patch application/octet-stream 75.9 KB
0002-Extend-START_REPLICATION-command-to-accept-walsender.patch application/octet-stream 13.3 KB
