
Re: Exit walsender before confirming remote flush in logical replication

From:	Andrey Silitskiy <a(dot)silitskiy(at)postgrespro(dot)ru>
To:	"Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "sawada(dot)mshk(at)gmail(dot)com" <sawada(dot)mshk(at)gmail(dot)com>, "michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "peter(dot)eisentraut(at)enterprisedb(dot)com" <peter(dot)eisentraut(at)enterprisedb(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject:	Re: Exit walsender before confirming remote flush in logical replication
Date:	2026-01-17 15:46:24
Message-ID:	e820c5d5-a95e-4785-bba1-3806fc062a64@postgrespro.ru
Lists:	pgsql-hackers

Dear Hayato,
Thanks for your comments! Updated the patch.

On Jan 15, 2026 at 11:48 AM Hayato Kuroda
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> I think we can just use stop() because it internally runs `pg_ctl stop`
> and that command waits till the wait is finished by default. I feel it
> is dangerous to determine timeout to 5sec because the test can work on
> very poor environment.

ok_with_timeout was added because it produces a more informative log when
the problem from this thread occurs: "Failed test 'Successful fast
shutdown of server with empty output buffers (timed out after 5 seconds)'"
instead of the usual "pg_ctl stop failed". However, I have now noticed that
if the timeout passed to this function is greater than the standard
PGCTLTIMEOUT, the standard timeout triggers first and only "pg_ctl stop
failed" is written. So it is probably reasonable to remove these functions.
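
The interaction between the two timeouts can be illustrated with a minimal
sketch. It is written in Python for brevity and is not the test's actual
Perl code: the function name and the 60-second default (pg_ctl's documented
default wait when PGCTLTIMEOUT is unset) are assumptions made only for
illustration, and the behavior when both timeouts are equal is chosen
arbitrarily.

```python
def reported_failure(test_timeout_s: float, pgctl_timeout_s: float = 60.0) -> str:
    """Return the failure message a hung shutdown would produce.

    Hypothetical model: whichever timeout expires first decides the message.
    """
    if test_timeout_s < pgctl_timeout_s:
        # The test-level timeout fires first, so the descriptive message is logged.
        return f"timed out after {test_timeout_s:g} seconds"
    # Otherwise pg_ctl gives up first and only its generic error is reported.
    return "pg_ctl stop failed"
```

Under this model a 5-second test timeout yields the descriptive message,
while any value at or above the pg_ctl timeout falls back to the generic one.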

> Also, not sure, how can we ensure the buffer is full here? Also, even
> if we have the way to check, the size may be quite platform depending.
> I think it may be better to test both streaming and logical replication
> instead of testing empty/full output buffer. Thought?

Initially, the second test case was added to show that the previous patches
did not fix the hang when the output buffers are full. I agree that this
may depend on the platform, and I can't think of a way to guarantee that
the buffer is full, but the test case still seems useful for checking the
new mode. The test covers only logical replication, because so far I'm not
sure how to reliably reproduce a flush delay on a physical replica in a
test. Any ideas?

Regards,
Andrey Silitskiy

Attachment	Content-Type	Size
v3-0001-Introduce-a-new-GUC-wal_sender_shutdown_mode.patch	text/x-patch	13.4 KB

In response to

RE: Exit walsender before confirming remote flush in logical replication at 2026-01-15 09:48:35 from Hayato Kuroda (Fujitsu)
