| From: | Alexander Kuzmenkov <akuzmenkov(at)tigerdata(dot)com> |
|---|---|
| To: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | "pg_ctl stop" stops working after a backend crash |
| Date: | 2026-03-06 19:35:32 |
| Message-ID: | CALzhyqxD2_0HGn4zz5agovrGF5hsGebUw8SUK+JdTpXxd9nsyQ@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi hackers,
I noticed that sometimes, when I'm running the regression tests and a
backend crashes, the postmaster gets stuck in a weird state where it
neither terminates nor responds to `pg_ctl stop` anymore. I can
semi-reliably reproduce this on 18.3 using the simple script below.
```
set -e
pg_ctl start -w
psql -X -d postgres -c "SELECT pg_sleep(60)" &>/dev/null &
sleep 0.3
VICTIM=$(psql -X -d postgres -tAc \
"SELECT pid FROM pg_stat_activity WHERE query LIKE '%pg_sleep%' LIMIT 1")
kill -9 "$VICTIM" # trigger crash recovery
sleep 0.01 # let postmaster start reinitializing
timeout 8 pg_ctl stop -m fast &
STOP_PID=$!
sleep 6
if kill -0 "$STOP_PID" 2>/dev/null; then
echo "pg_ctl stop froze. Active processes:"
pgrep -al postgres
exit 1
fi
echo "Successful shutdown"
```
A typical output of this is:
598814 postgres
598864 postgres: io worker 0
598865 postgres: io worker 1
598866 postgres: io worker 2
598868 postgres: checkpointer
These processes stay around indefinitely, and the shutdown finishes
once I run `pkill -USR2 postgres`.
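Since `pgrep -al postgres` matches every postgres process on the machine, it can help to narrow the listing to the one cluster that is stuck. The following is a minimal sketch, not part of the original report: the function names `postmaster_pid` and `list_cluster_processes` are hypothetical, and it assumes that the data directory is passed as an argument or set in `PGDATA`, that the first line of `postmaster.pid` holds the postmaster PID, and that a procps-style `pgrep` is available.

```shell
# Print the postmaster PID recorded in a data directory.
# Assumption: the first line of postmaster.pid is the postmaster PID.
postmaster_pid() {
    local datadir=${1:-$PGDATA}
    local pidfile="$datadir/postmaster.pid"
    [ -r "$pidfile" ] || return 1
    head -n 1 "$pidfile"
}

# List the postmaster and its direct children for one cluster only.
list_cluster_processes() {
    local pid
    pid=$(postmaster_pid "$1") || {
        echo "no postmaster.pid found" >&2
        return 1
    }
    # kill -0 only tests whether the process exists; no signal is sent.
    kill -0 "$pid" 2>/dev/null || {
        echo "postmaster $pid is not running" >&2
        return 1
    }
    echo "postmaster: $pid"
    # pgrep exits non-zero when there are no children; that is not an error here.
    pgrep -a -P "$pid" || true
}
```

Calling `list_cluster_processes "$PGDATA"` after the frozen `pg_ctl stop` should then show only the processes that belong to this cluster.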
A `pg_ctl stop` that no longer works looks like a bug, so I thought I
should ask for your comments on this.
Best regards
Alexander Kuzmenkov
TigerData