| From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | Andrey Silitskiy <a(dot)silitskiy(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Ronan Dunklau <ronan(at)dunklau(dot)fr>, Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "sawada(dot)mshk(at)gmail(dot)com" <sawada(dot)mshk(at)gmail(dot)com>, "michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "peter(dot)eisentraut(at)enterprisedb(dot)com" <peter(dot)eisentraut(at)enterprisedb(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com> |
| Subject: | Re: Exit walsender before confirming remote flush in logical replication |
| Date: | 2026-04-07 05:39:07 |
| Message-ID: | CAHGQGwGoZos=7G5eRUs3JyFqYhCNLuZMmDmxS-cWjS0R56Jvcg@mail.gmail.com |
| Lists: | pgsql-hackers |
On Tue, Apr 7, 2026 at 12:32 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> Failed on CI just now:
>
> https://cirrus-ci.com/task/6745359004729344?logs=test_world#L410
> https://api.cirrus-ci.com/v1/artifact/task/6745359004729344/testrun/build/testrun/subscription/038_walsnd_shutdown_timeout/log/regress_log_038_walsnd_shutdown_timeout
>
> [14:58:26.146](0.066s) ok 3 - have walreceiver pid 13796
> ### Stopping node "publisher" using mode fast
> # Running: pg_ctl --pgdata /home/postgres/postgres/build/testrun/subscription/038_walsnd_shutdown_timeout/data/t_038_walsnd_shutdown_timeout_publisher_data/pgdata --mode fast stop
> waiting for server to shut down........................................................................................................................... failed
> pg_ctl: server does not shut down
> # pg_ctl stop failed: 256
> # Postmaster PID for node "publisher" is 3679
> [15:00:38.178](132.032s) Bail out! pg_ctl stop failed

Thanks for reporting this!

From the CI results [1], the failure in 038_walsnd_shutdown_timeout.pl appears
to occur intermittently on FreeBSD. The failing case tests that, when both
physical and logical replication are in use with slotsync enabled and both are
stalled (walreceiver on the standby and the logical apply worker on
the subscriber are blocked), shutting down the primary completes due to
wal_sender_shutdown_timeout.

On FreeBSD, however, it seems that after the shutdown request, the physical
walsender can occasionally keep running, preventing shutdown from completing.
As a result, pg_ctl stop times out and the test fails.

I’ll investigate the cause. If it takes time to identify, I may temporarily
disable just this test case so it doesn’t block other development and testing,
then re-enable it once the issue is fixed.

Regards,
[1]
https://cirrus-ci.com/build/5134823678803968
https://cirrus-ci.com/build/5735329598013440
https://cirrus-ci.com/build/5917696627310592
https://cirrus-ci.com/build/5742460250357760
--
Fujii Masao