Re: Adding REPACK [concurrently]

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net>
Subject: Re: Adding REPACK [concurrently]
Date: 2026-03-26 09:51:50
Message-ID: 23138.1774518710@localhost
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:

> On 2026-Mar-25, Srinath Reddy Sadipiralla wrote:
>
> > Then as suggested by Alvaro off-list I checked the lock upgrade
> > behavior during the table swap phase. I observed that if another
> > transaction holds a conflicting lock on the table when the swap is
> > attempted, it can lead to “transient table” data loss during a manual
> > or timeout abort. when a REPACK (concurrent) waits for a conflicting
> > lock to be released and eventually hits a lock_timeout (or is
> > cancelled via ctrl+c), the transaction aborts. During this abort, the
> > cleanup process triggers smgrDoPendingDeletes. This results in the
> > removal of all transient table relfiles and decoder worker files
> > created during the process. This effectively wipes out the work done
> > by the transient table creation before the swap could successfully
> > complete, this happens because during transient table creation we add
> > the table to the PendingRelDelete list.
>
> I think we certainly need to make the files be deleted in some
> reasonable fashion if repack fails partway through.

I think that Srinath tries to explain the cleanup in detail, but in fact
that's a normal processing of transaction abort. Not sure we need to do
anything special.

> As for lock upgrade, I wonder if the best way to handle this isn't to
> hack the deadlock detector so that it causes any *other* process to die,
> if they detect that they would block on REPACK. Arguably there's
> nothing that you can do to a table while its undergoing REPACK
> CONCURRENTLY; any alterations would have to wait until the repacking is
> compelted. We can implement that idea simply enough, as shown in this
> crude prototype. (I omitted the last three patches in the series, and
> squashed my proposed changes into 0003, as announced in my previous
> posting.)

I haven't thought of it because I'm not familiar with the deadlock detector,
but what you do seems consistent with the way blocking by autovacuum is
handled.

The only problem I noticed is that PROC_IN_CONCURRENT_REPACK is not cleared at
the end of transaction. Perhaps it should be added to PROC_VACUUM_STATE_MASK
(name of which is already misleading due to the presence of PROC_IN_SAFE_IC,
but that's another problem).

> The isolation test file is also a bit crude; I just copied repack.spec
> to a new file and removed the uninteresting bits.

Maybe just add a new permutation to repack.spec? I don't remembery if I
created repack_toast.spec as a separate file just for better readability or if
there was some other issue, but the deadlock test might fit into repack.spec.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexandre Felipe 2026-03-26 09:51:56 Re: SLOPE - Planner optimizations on monotonic expressions.
Previous Message SATYANARAYANA NARLAPURAM 2026-03-26 09:48:42 Re: Introduce XID age based replication slot invalidation