Re: Exit walsender before confirming remote flush in logical replication

From: Andrey Silitskiy <a(dot)silitskiy(at)postgrespro(dot)ru>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "sawada(dot)mshk(at)gmail(dot)com" <sawada(dot)mshk(at)gmail(dot)com>, "michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "peter(dot)eisentraut(at)enterprisedb(dot)com" <peter(dot)eisentraut(at)enterprisedb(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, 'Peter Smith' <smithpb2250(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>
Subject: Re: Exit walsender before confirming remote flush in logical replication
Date: 2025-11-18 10:32:01
Message-ID: 0fe5288d-5b4a-4aab-ae0e-c5a2fca0ee33@postgrespro.ru
Lists: pgsql-hackers

Dear pgsql-hackers,

I am also interested in solving this problem, so I suggest a patch which
is based on Hayato's work shared earlier.

The problem we are solving is that logical walsender processes currently
do not allow postgres to shut down until the receiver side confirms the
flush of all data. In the case of logical replication, this is not
necessary. It can lead to an undesirable shutdown delay if, for example,
the apply worker is waiting for locks to be released.

I agree with the opinion that the default behavior of the system should
not be changed, as some clients may rely on the current behavior. But
instead of the START_REPLICATION parameter, I propose a GUC parameter on
the sender that controls the shutdown mode for all logical walsenders.
First, the START_REPLICATION parameter places responsibility for choosing
the sender’s shutdown semantics on the receiver side. Second,
per-subscriber settings do not solve the problematic operational case
where many walsenders exist: if even one of N walsender processes remains
configured as non-immediate, the publisher can still be blocked. In other
words, setting immediate for most subscribers but missing one does not fix
the global inability to shut down.
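
The "one of N" argument can be illustrated with a toy model. This is only
a sketch and is not taken from the patch: the mode names, the Walsender
class, and the helper functions are all hypothetical, and the model assumes
that shutdown completes only once every walsender is able to exit.

```python
from dataclasses import dataclass

# Hypothetical mode names, for illustration only; the patch may use others.
WAIT_FLUSH = "wait_flush"
IMMEDIATE = "immediate"


@dataclass
class Walsender:
    name: str
    mode: str              # shutdown mode in effect for this walsender
    flush_confirmed: bool  # has the receiver confirmed the flush of all data?


def can_exit(ws: Walsender) -> bool:
    """Immediate mode exits at once; otherwise wait for the flush confirmation."""
    return ws.mode == IMMEDIATE or ws.flush_confirmed


def publisher_can_shut_down(walsenders: list[Walsender]) -> bool:
    """Assumed rule: the publisher shuts down only when every walsender can exit."""
    return all(can_exit(ws) for ws in walsenders)


def with_global_mode(walsenders: list[Walsender], mode: str) -> list[Walsender]:
    """Model a sender-side setting that applies one mode to all walsenders."""
    return [Walsender(ws.name, mode, ws.flush_confirmed) for ws in walsenders]


# Per-subscriber configuration: two subscribers set to immediate, one missed.
# No receiver has confirmed the flush (e.g. apply workers are waiting on locks).
per_subscriber = [
    Walsender("sub1", IMMEDIATE, False),
    Walsender("sub2", IMMEDIATE, False),
    Walsender("sub3", WAIT_FLUSH, False),
]

print(publisher_can_shut_down(per_subscriber))                               # False
print(publisher_can_shut_down(with_global_mode(per_subscriber, IMMEDIATE)))  # True
```

In this model a single walsender left in the waiting mode is enough to
block shutdown, while a single sender-side setting removes that dependency
on every subscriber being configured correctly.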

I also attach a TAP test that reproduces an apply worker waiting for a
lock to be released, and the successful shutdown of the publisher in
immediate walsender shutdown mode.

Best Regards,
Andrey

Attachment Content-Type Size
0001-Introduce-a-new-GUC-logical_wal_sender_shutdown_mode.patch text/x-patch 11.5 KB
