Re: Adding REPACK [concurrently]

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net>
Subject: Re: Adding REPACK [concurrently]
Date: 2026-01-30 19:33:56
Message-ID: 57210.1769801636@localhost
Lists: pgsql-hackers

Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com> wrote:

> > PROC_IN_VACUUM shouldn't be used for the same reason StartupDecodingContext()
> > avoids setting PROC_IN_LOGICAL_DECODING in transaction. I've removed that and
> > the tests work for me. Especially the "cache lookup failed" error is almost
> > certainly related. Please let me know if you still get the other errors
>
> Yes, now it is passing.
>
> > (Except for 2, which is probably due to the MVCC-unsafe behavior, as discussed
> > earlier.)
>
> Not happening too. BTW, it was non MVCC-related, because in that case relcheckxmin would catch it.
>
> What if:
>
> 1) add new PROC_IN_REPACK flag
> 2) use it in catalog horizon, but not in data (like was done in [0] for PROC_IN_SAFE_IC)
>
> And after we have options:
> 3) do not "table_close(NewHeap, NoLock);" - keep ShareUpdateExclusiveLock all the time to prevent VACUUM enter
> 4) do not heap_page_prune_opt in repack transaction (just using simple flag)
> Or
> 3) preserve xmin/xmax of original transaction in repacked data
> 4) but better to keep ShareUpdateExclusiveLock anyway

I've been thinking of another approach. Note that REPACK creates a new table
only to eventually swap the relation files and drop it. Since creating the
table writes catalog entries, the transaction needs to get an XID assigned
very early.

I'm considering a special kind of relation whose catalog entries remain in the
catalog cache and are never written to the catalog tables. (Unlike a temporary
relation, it'd be WAL-logged so that REPACK can be replayed on a standby.)

If we eventually implement MVCC safety, an XID will not be needed during data
copying either. And it shouldn't even be needed to build the indexes, as long
as their catalog entries are also "cache only". Thus the transaction REPACK is
running in would not need an XID until the data has been copied, the indexes
built and even (most of) the concurrent data changes replayed. Only the final
catalog changes would require an XID, but those should take a very short time.

Without an XID and with snapshot resetting, REPACK should not really hold back
the xmin horizon that VACUUM relies on.
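
To illustrate the effect on the horizon, here is a toy model (not PostgreSQL
code; the Backend class, the xmin_horizon() function and all the numbers are
made up for this sketch, and XID wraparound is ignored). It assumes the
horizon is simply the oldest XID or snapshot xmin advertised by any backend.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    """Hypothetical stand-in for what a backend advertises to others."""
    name: str
    xid: Optional[int] = None   # assigned transaction ID, if any
    xmin: Optional[int] = None  # oldest XID the current snapshot still needs


def xmin_horizon(backends: List[Backend], next_xid: int) -> int:
    """Oldest XID any backend may still need (toy version, no wraparound)."""
    held = [v for b in backends for v in (b.xid, b.xmin) if v is not None]
    return min(held, default=next_xid)


client = Backend("client", xmin=500)

# REPACK got its XID early and keeps one old snapshot for the whole run.
repack_with_xid = Backend("repack", xid=100, xmin=100)
horizon_with_xid = xmin_horizon([repack_with_xid, client], next_xid=600)

# REPACK has no XID yet and has recently reset its snapshot.
repack_no_xid = Backend("repack", xid=None, xmin=480)
horizon_no_xid = xmin_horizon([repack_no_xid, client], next_xid=600)

print(horizon_with_xid, horizon_no_xid)
```

In the first case the horizon stays pinned at the XID REPACK acquired at
start; in the second it follows the most recently reset snapshot.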

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
