Re: Adding REPACK [concurrently]

From: Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net>
Subject: Re: Adding REPACK [concurrently]
Date: 2026-02-01 19:46:31
Message-ID: CADzfLwUukiGOPoUkDgf6oEB-Y0TnNy6UFUN4obnU-AN5W1N=sw@mail.gmail.com
Lists: pgsql-hackers

Hello!

PART 1:

--------------

Something is still wrong with 0006, see:

'pgbench: error: client 12 script 0 aborted in command 2 query 0: ERROR: attempted to overwrite invisible tuple
https://cirrus-ci.com/task/6385612527239168?logs=test_world#L300

But it is hard to reproduce; it has happened only once.

--------------

Also, I once got:
[16:25:18.641] # at /tmp/cirrus-ci-build/contrib/amcheck/t/007_repack_concurrently.pl line 57.
[16:25:18.641] # 'pgbench: error: client 6 script 0 aborted in command 2 query 0: ERROR: relation 21856 deleted while still in use
https://cirrus-ci.com/task/4686014242881536?logs=test_world#L384

This was with the PROC_IN_REPACK version (see below). I think the failure is
unrelated to it, but I'm not 100% sure.

PART 2:

> I'm considering a special kind of relation whose catalog entries remain in the
> catalog cache and are never written to the catalog tables. (Unlike temporary
> relation, it'd be WAL logged so that REPACK can be replayed on standby.)

I think that is too complicated, especially once replication logic is included.
The approach with a catalog-only xid is much simpler; it was even committed
(yes, it was reverted, but for another reason).

Essentially we have two issues:
1) make sure catalog entries are not removed by vacuum
2) make sure data in the new table is not vacuumed away either

For the first, PROC_IN_REPACK is enough.
For the second, it depends on whether the MVCC-safe data (original xmin/xmax)
is preserved. If yes, it looks like nothing more is needed.

If not, just prevent vacuum from touching the table (but it looks like that
is already done, because the lock on NewHeap is held until commit).
Additionally, snapshots would have to be reset during the index build itself,
but that is in the scope of another patch.

I have implemented a PROC_IN_REPACK POC in the attached patch.
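
To make the "catalog-only xid" idea concrete, here is a toy model in plain C.
It is not the attached patch and does not use PostgreSQL's real procarray
code. It only assumes that a backend flagged as being in REPACK holds back
the catalog removal horizon but not the data horizon. All names
(ToyBackend, ToyHorizons, TOY_PROC_IN_REPACK, toy_compute_horizons) and the
flag value are hypothetical, and xid wraparound is ignored.

```c
#include <stdint.h>

typedef uint32_t ToyXid;        /* toy xid; wraparound is ignored here */
#define TOY_INVALID_XID 0

/* Hypothetical flag bit; the real value is whatever the patch defines. */
#define TOY_PROC_IN_REPACK 0x01

typedef struct ToyBackend
{
    ToyXid  xmin;               /* oldest xid this backend still needs */
    int     flags;              /* TOY_PROC_IN_REPACK or 0 */
} ToyBackend;

typedef struct ToyHorizons
{
    ToyXid  data_horizon;       /* tuples older than this are removable in user tables */
    ToyXid  catalog_horizon;    /* same, but for catalog tables */
} ToyHorizons;

static ToyXid
toy_min(ToyXid a, ToyXid b)
{
    return (a < b) ? a : b;
}

/*
 * Compute both horizons. Assumption of this sketch: a backend flagged
 * TOY_PROC_IN_REPACK contributes its xmin only to the catalog horizon.
 */
ToyHorizons
toy_compute_horizons(const ToyBackend *procs, int nprocs, ToyXid next_xid)
{
    /* With no backends holding an xmin, everything before next_xid is removable. */
    ToyHorizons h = {next_xid, next_xid};

    for (int i = 0; i < nprocs; i++)
    {
        if (procs[i].xmin == TOY_INVALID_XID)
            continue;           /* backend holds no snapshot */

        /* Every backend with an xmin holds back catalog cleanup. */
        h.catalog_horizon = toy_min(h.catalog_horizon, procs[i].xmin);

        /* Only unflagged backends hold back cleanup of user data. */
        if (!(procs[i].flags & TOY_PROC_IN_REPACK))
            h.data_horizon = toy_min(h.data_horizon, procs[i].xmin);
    }
    return h;
}
```

In this model, with an ordinary backend at xmin 100, a flagged backend at
xmin 50 and next xid 200, the data horizon is 100 and the catalog horizon
is 50, so a long REPACK would not block vacuum of user tables.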

Also, I am still not sure whether the MVCC-safe implementation is worth
its complexity compared with the "relcheckxmin" approach [0].

[0]:
https://www.postgresql.org/message-id/CADzfLwUEH5%2BLjCN%2B6kRfSsXwuou8rKXyVV42Wi-O_TG0360Kug%40mail.gmail.com

Best regards,
Mikhail.

Attachment: vX-0001-Handle-VACUUM-interaction-with-REPACK-CONCURRENTL.patch (application/octet-stream, 6.8 KB)
