| From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
|---|---|
| To: | Andrey Silitskiy <a(dot)silitskiy(at)postgrespro(dot)ru> |
| Cc: | Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Ronan Dunklau <ronan(at)dunklau(dot)fr>, Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "sawada(dot)mshk(at)gmail(dot)com" <sawada(dot)mshk(at)gmail(dot)com>, "michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "peter(dot)eisentraut(at)enterprisedb(dot)com" <peter(dot)eisentraut(at)enterprisedb(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com> |
| Subject: | Re: Exit walsender before confirming remote flush in logical replication |
| Date: | 2026-04-05 02:24:08 |
| Message-ID: | CAHGQGwHScfvDU742KEn9e3A6uJH3FEAnQVOKc_y--7MTyJKKQw@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Fri, Apr 3, 2026 at 6:37 PM Andrey Silitskiy
<a(dot)silitskiy(at)postgrespro(dot)ru> wrote:
>
> On Apr 3, 2026 Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> > I've updated the patch based on my earlier comments and made some
> > cosmetic improvements.
>
> Thanks for comments! I also just made the same changes after your review.
> Attaching an updated patch with fixes after your review, some of last
> cosmetic changes and two excluded test cases.
>
> There are currently 5 test cases: two for 0ms(empty and full buffers), two
> for 10ms (empty and full buffers), and one for the standby case. I think
> it is ok to exclude two cases with 0ms to speed up the launch of tests.
Agreed.
I removed those two tests and restructured the walsender shutdown tests.
Previously, each test was defined as a function and executed with different
settings (0ms and 10ms). Since the 0ms variants were dropped, I simplified
the structure and inlined the tests for better readability.
For the test case where both physical and logical replication run with
slotsync enabled, the test used SIGSTOP to stop the walreceiver. However,
sending SIGSTOP via kill is not available on Windows, so this test is now
skipped on Windows.
An updated patch is attached. I'm thinking to commit this before the feature
freeze. If there are any remaining minor issues, we can continue to refine
them afterward.
> > This sentence doesn't seem necessary, as similar GUCs don't mention this
>
> I decided to leave in the documentation a mention of the possibility of
> setting a parameter per replication connection. I think some users are not
> aware of this possibility and this short sentence might give them the idea.
OK. So, how about expanding the description to make it clearer for users?
I updated the description as follows in the updated patch.
+ This parameter can be set in <varname>primary_conninfo</varname> and
+ in the <literal>CONNECTION</literal> clause of
+ <command>CREATE SUBSCRIPTION</command> (for example, include
+ <literal>options=-cwal_sender_shutdown_timeout=10s</literal> in the
+ connection string), allowing different timeouts per replication
+ connection. For example, when both physical and logical replication
+ are used, it can be disabled for physical replication (e.g., for
+ switchovers) while enabled for logical replication to limit shutdown
+ time.
> > It would also be better to move "replication may be incomplete" to
> errdetail(),
> > and clarify it, for example, "Walsender is terminated before all WAL
> data was
> > replicated to the receiver".
>
> In some cases, replication may be fully completed even when exiting with
> WalSndDoneImmediate, so I edited the message.
Understood. Per the Error Message Style Guide [1], I changed "may" to "might"
in the detail message:
+ errdetail("Walsender process might have been terminated before all
WAL data was replicated to the receiver.")));
Regards,
[1] https://www.postgresql.org/docs/devel/error-style-guide.html#ERROR-STYLE-GUIDE-TRICKY-WORDS
--
Fujii Masao
| Attachment | Content-Type | Size |
|---|---|---|
| v10-0001-Add-wal_sender_shutdown_timeout-GUC-to-limit-shu.patch | application/octet-stream | 23.8 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Alexander Korotkov | 2026-04-05 02:24:48 | Re: Asynchronous MergeAppend |
| Previous Message | Bruce Momjian | 2026-04-05 02:12:57 | Re: PG 19 release notes and authors |