Quick Links

Re: Adding REPACK [concurrently]

From:	Antonin Houska <ah(at)cybertec(dot)at>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net>
Subject:	Re: Adding REPACK [concurrently]
Date:	2026-04-15 14:50:11
Message-ID:	25514.1776264611@localhost
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> wrote:

> On 2026-04-14 15:37:56 +0200, Antonin Houska wrote:
> > Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > > On 2026-04-12 15:31:20 +0200, Mihail Nikalayeu wrote:
> > > > Instead of cancelling the backend entered the deadlock detector - it
> > > > cancel some another (nearest hard edge) until it is possible to get the
> > > > lock (either by
> > > > reordering or directly).
> > >
> > > I don't think that's as good. The problem is that that way you're only
> > > detecting the deadlocks once they have materialized (i.e. once repack actually
> > > does the lock upgrade), rather than cancelling when we know that the problem
> > > starts.
> >
> > This is my hack that tries to do that.
>
> I still think this needs to be in the deadlock detector. The lock cycle just
> needs to be a bit more complicated for a hack in JoinWaitQueue not to work.
> There's no guarantee that the wait that triggers the deadlock is actually on
> the relation being repacked.

ok, I see.

I thought of a "hypothetical graph", which would include the to-be-granted
lock, but the major issue is that it will not work correctly without the
locking the LMGR's LW locks we do in CheckDeadLock():

for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
LWLockAcquire(LockHashPartitionLockByIndex(i), LW_EXCLUSIVE);

And obviously, doing this each time we want to insert a lock into the queue
would be bad for performance. It's even mentioned in the storage/lmgr/README
that the current approach is optimistic, so I think that major rework would be
needed if we wanted to entirely avoid waiting that leads to deadlock.

The approach proposed by Mihail [1] seems the least problematic to me, and
something like that occurred to me when I thought about the problem the first
time. However, when we wake up the other processes in order to run the
deadlock detection, they should do that immediately. I've got no good idea
about implementation at the moment, since latch can be set for unrelated
reasons. (Besides that, I have some more questions about this patch, which I
can post separately.)

[1] https://www.postgresql.org/message-id/CADzfLwURKVNQ%2B%2BDpi7bjoGfj-8pchDQEVex3eWBx0NCYn6TbDQ%40mail.gmail.com

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

In response to

Re: Adding REPACK [concurrently] at 2026-04-14 13:58:55 from Andres Freund

Responses

Re: Adding REPACK [concurrently] at 2026-04-15 14:59:51 from Andres Freund
Re: Adding REPACK [concurrently] at 2026-04-15 15:54:12 from Mihail Nikalayeu

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	jian he	2026-04-15 14:59:13	Re: First draft of PG 19 release notes
Previous Message	Tom Lane	2026-04-15 14:37:32	Re: Add errdetail() with PID and UID about source of termination signal