| From: | Michael Paquier <michael(at)paquier(dot)xyz> |
|---|---|
| To: | pgsql-committers(at)lists(dot)postgresql(dot)org |
| Subject: | pgsql: Fix procLatch ownership race in ProcKill() |
| Date: | 2026-05-27 08:20:29 |
| Message-ID: | E1wS9V7-001Jgw-1y@gemulon.postgresql.org |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-committers |
Fix procLatch ownership race in ProcKill()
DisownLatch() was executed after the PGPROC entry of the process
terminated is pushed back into a freelist. A newly-forked backend that
recycles the slot could call OwnLatch() and PANIC with a "latch already
owned by PID", taking down the server.
There were two scenarios related to lock groups where this issue could
be reached:
* A follower pushes the leader's PGPROC back to the freelist while the
leader has not yet called DisownLatch() in its own ProcKill().
* A leader outliving all its followers pushes its own PGPROC onto the
freelist before reaching DisownLatch(), which would be the most common
scenario.
This issue is fixed by calling SwitchBackToLocalLatch() and
DisownLatch() at an earlier phase of ProcKill(), before any freelist
manipulation happens, so that the slot of the backend terminated is
never exposed as owning a latch.
Note that pgstat_reset_wait_event_storage() is kept at a later stage.
An upcoming commit will take advantage of that by introducing a test
able to check the original PANIC scenario.
Author: Vlad Lesin <vladlesin(at)gmail(dot)com>
Reviewed-by: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Reviewed-by: Michael Paquier <michael(at)paquier(dot)xyz>
Discussion: https://postgr.es/m/d2983796-2603-41b7-a66e-fc8489ddb954@gmail.com
Backpatch-through: 14
Branch
------
REL_17_STABLE
Details
-------
https://git.postgresql.org/pg/commitdiff/e14b4ea4a6ece9d89a9cb5cf40472664f26f322f
Modified Files
--------------
src/backend/storage/lmgr/proc.c | 30 +++++++++++++++++++-----------
1 file changed, 19 insertions(+), 11 deletions(-)
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Heikki Linnakangas | 2026-05-27 09:06:36 | pgsql: Fix self-deadlock when replaying WAL generated by older minor ve |
| Previous Message | Michael Paquier | 2026-05-27 05:53:10 | pgsql: Fix race conditions in ProcKill()'s lock-group freelist handling |