| From: | Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com> |
|---|---|
| To: | Antonin Houska <ah(at)cybertec(dot)at> |
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net> |
| Subject: | Re: Adding REPACK [concurrently] |
| Date: | 2026-04-23 11:43:18 |
| Message-ID: | CADzfLwV9mT-dNcKF=isjx-nE4CPo0k+UL3sgvqzrCjbemZUJDg@mail.gmail.com |
| Lists: | pgsql-hackers |
Hello!
On Mon, Apr 20, 2026 at 7:44 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> When another process tries to get the lock after that, it checks this field,
> and if it's set, it checks for deadlocks even if deadlock timeout hasn't
> expired yet. If the process is already in the lock's queue and sleeping, the
> lock upgrading process wakes it up so it checks the flag immediately.
Either way, there is no point in waking up other backends just to force
them to cancel themselves.
It is better to go the [0] way and simply cancel the other backend when
we are REPACK and the deadlock detector finds a cycle.
We already have all the required infrastructure; the comment in proc.c
even describes this possibility [1]:
> * Get this process out of wait state. (Note: we could do this more
> * efficiently by relying on lockAwaited, but use this coding to
> * preserve the flexibility to kill some other transaction than the
> * one detecting the deadlock.)
At the same time, version [2] (with FutureWaitLock, explained in more
detail in [3]) is correct and cancels other backends without pointless
waiting, but it feels too complicated because of the work needed to
support the fast-path locking code.
Also, here is one simple idea inspired by your version.
What about adding a new "do not try to upgrade this lock" flag to the
LOCK structure? If a backend attempts an upgrade (it already holds some
lock mode on that lock and requests a stronger one) while the flag is
set, and the backend does not carry the 'deadlock_protected' flag from
[0], it just fails fast with a "future deadlock detected" error.
That way [0] ensures correctness, and "do not try to upgrade this lock"
is merely an optimization to fail quickly in the most common scenarios.
Some cycles involving two or more other backends might still wait a
long time before being cancelled (which [2] solves, at the cost of
complexity), but I think this is a rare enough case that we can ignore
it, because REPACK itself remains protected from deadlock.
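To make the idea concrete, here is a minimal standalone sketch of the
proposed fail-fast check. All names here (no_lock_upgrade,
deadlock_protected, must_fail_fast) are hypothetical illustrations, not
existing PostgreSQL identifiers, and the structs are drastically
simplified compared to the real LOCK/PROCLOCK machinery:

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the shared LOCK entry. */
typedef struct
{
    bool no_lock_upgrade;   /* proposed "do not try to upgrade" flag */
} LockState;

/* Hypothetical view of an incoming lock request. */
typedef struct
{
    bool holds_some_mode;    /* backend already holds a mode on this lock */
    bool deadlock_protected; /* flag from [0]: backend may win deadlocks */
} BackendLockRequest;

/*
 * Return true when the request should fail fast with a
 * "future deadlock detected" error instead of queueing: the lock is
 * marked non-upgradable, the request is an upgrade (the backend already
 * holds a weaker mode), and the backend is not deadlock-protected.
 */
static bool
must_fail_fast(const LockState *lock, const BackendLockRequest *req)
{
    return lock->no_lock_upgrade &&
           req->holds_some_mode &&
           !req->deadlock_protected;
}
```

A backend that holds no mode yet (a fresh acquisition) or that carries
the 'deadlock_protected' flag would still queue normally; only the
unprotected upgrade attempt is rejected early.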
Mikhail.
[0]: https://www.postgresql.org/message-id/flat/CADzfLwURKVNQ%2B%2BDpi7bjoGfj-8pchDQEVex3eWBx0NCYn6TbDQ%40mail.gmail.com#bceb18354aa20c130a94b1deedd76fb7
[1]: https://github.com/postgres/postgres/blob/master/src/backend/storage/lmgr/proc.c#L1870-L1873
[2]: https://www.postgresql.org/message-id/flat/CADzfLwUSnGnkfLwCWHQ%3DVVuAY1YTo%2B0Lr7pb%2BOPWUZbcYKSRUw%40mail.gmail.com#e7ba203b7402de332dfb58ce5329a73a
[3]: https://www.postgresql.org/message-id/flat/CADzfLwVf-3mjMwSTOcj9djNzGd-UjBOYbFjxgXRhtKuH_4rajA%40mail.gmail.com#b69f9c1d1d3d1cbf7f6d66be790894c4