From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Antonin Houska <ah(at)cybertec(dot)at> |
Cc: | Robert Treat <rob(at)xzilla(dot)net>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com> |
Subject: | Re: Adding REPACK [concurrently] |
Date: | 2025-08-30 17:50:56 |
Message-ID: | 202508301750.cbohxyy2pcce@alvherre.pgsql |
Lists: | pgsql-hackers |
Hello,
Here's v19 of this patchset. This is mostly Antonin's v18. I added a
preparatory v19-0001 commit, which splits vacuumdb.c to create a new
file, vacuuming.c (and its header file vacuuming.h). If you look at it
under 'git show --color-moved=zebra' you should notice that most of it
is just code movement; there are hardly any code changes.
v19-0002 has absorbed Antonin's v18-0005 (the pg_repackdb binary)
together with the introduction of the REPACK command proper; but instead
of using a symlink, I just created a separate pg_repackdb.c source file
for it and we compile that small new source file with vacuuming.c to
create a regular binary. BTW the meson.build changes look somewhat
duplicative; maybe there's a less dumb way to go about this. (For
instance, maybe just have libscripts.a include vacuuming.o, though it's
not used by any of the other programs in that subdir.)
I'm not wedded to the name "vacuuming.c"; happy to take suggestions.
After 0002, the pg_repackdb utility should be ready to take clusterdb's
place, and also vacuumdb --full, with one gotcha: if you try to use
pg_repackdb with an older server version, it will fail, claiming that
REPACK is not supported. This is not ideal. Instead, we should make it
run VACUUM FULL (or CLUSTER) in that case, so that if you have a fleet that
includes older servers you can use the new utilities there too.
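A minimal sketch of the fallback described above, not code from the patchset. The function name and the version cutoff (190000) are hypothetical placeholders, the REPACK spelling is inferred from the USING INDEX clause mentioned in this message, and identifiers are assumed to be quoted by the caller already. In a real client the version number would come from libpq's PQserverVersion().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * ASSUMPTION: first server version (PQserverVersion() numbering) that
 * understands REPACK.  190000 is a placeholder, not taken from the patch.
 */
#define REPACK_MIN_SERVER_VERSION 190000

/*
 * Build the command to send for one table.  Returns true if the command
 * fit into buf.  Table and index names must already be quoted.
 */
static bool
build_repack_command(char *buf, size_t buflen, int server_version,
                     const char *table, bool use_index,
                     const char *index_name)
{
    int         n;

    assert(buf != NULL && table != NULL && strlen(table) > 0);

    if (server_version >= REPACK_MIN_SERVER_VERSION)
    {
        /* New server: use REPACK (syntax assumed, see lead-in). */
        if (!use_index)
            n = snprintf(buf, buflen, "REPACK %s;", table);
        else if (index_name != NULL)
            n = snprintf(buf, buflen, "REPACK %s USING INDEX %s;",
                         table, index_name);
        else
            n = snprintf(buf, buflen, "REPACK %s USING INDEX;", table);
    }
    else if (!use_index)
        /* Older server, no index requested: same job as VACUUM FULL. */
        n = snprintf(buf, buflen, "VACUUM (FULL) %s;", table);
    else if (index_name != NULL)
        /* Older server, explicit index: CLUSTER with that index. */
        n = snprintf(buf, buflen, "CLUSTER %s USING %s;", table, index_name);
    else
        /* Older server, index left to the table's clustered index. */
        n = snprintf(buf, buflen, "CLUSTER %s;", table);

    return n >= 0 && (size_t) n < buflen;
}
```

Keeping the choice in one function would let the table-selection code in vacuuming.c stay unaware of which command ends up being sent.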
All the logic for vacuumdb to select tables to operate on has been moved
to vacuuming.c verbatim. This means this logic applies to pg_repackdb
as well. As long as you stick to repacking a single table this is okay
(read: it won't be used at all), but if you want to use parallel mode
(say to process multiple schemas), we might need to change it. For the
same reason, I think we should add an option to it (--index[=indexname])
to select whether to use the USING INDEX clause or not, and optionally
indicate which index to use; right now there's no way to select which
logic (cluster's or vacuum full's) to use.
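The proposed --index[=indexname] switch has three states: absent, given without a name, and given with a name. One way this could be parsed is with getopt_long()'s optional_argument, sketched below. The struct and function names are hypothetical, and treating an empty value (--index=) as "no name" is an assumption, not something decided in this thread.

```c
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for the three states of the proposed option. */
typedef struct RepackIndexOpt
{
    bool        use_index;      /* was --index given at all? */
    const char *index_name;     /* NULL unless --index=name was given */
} RepackIndexOpt;

/*
 * Parse only the --index[=indexname] switch.  Returns 0 on success and
 * -1 if an unrecognized option is seen.
 */
static int
parse_index_option(int argc, char *const argv[], RepackIndexOpt *opt)
{
    static const struct option long_options[] = {
        {"index", optional_argument, NULL, 1},
        {NULL, 0, NULL, 0}
    };
    int         c;

    assert(opt != NULL);
    opt->use_index = false;
    opt->index_name = NULL;
    optind = 1;                 /* restart scanning at the first argument */
    opterr = 0;                 /* leave error reporting to the caller */

    while ((c = getopt_long(argc, argv, "", long_options, NULL)) != -1)
    {
        if (c != 1)
            return -1;
        opt->use_index = true;
        /* optarg is NULL for a bare "--index" */
        if (optarg != NULL && strlen(optarg) > 0)
            opt->index_name = optarg;
    }
    return 0;
}
```

With optional_argument the name has to be attached with "=", so "--index foo" would be read as a bare --index followed by a non-option argument.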
Then v19-0003 through v19-0005 are Antonin's subsequent patches to add
the CONCURRENTLY option; I have not reviewed these at all, so I'm
including them here just for completeness. I also included v18-0006 as
posted by Mihail previously, though I have little faith that we're going
to include it in this release.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Pensar que el espectro que vemos es ilusorio no lo despoja de espanto,
sólo le suma el nuevo terror de la locura" (Perelandra, C.S. Lewis)
Attachment | Content-Type | Size |
---|---|---|
v19-0001-Split-vacuumdb-to-create-vacuuming.c-h.patch | text/x-diff | 69.3 KB |
v19-0002-Add-REPACK-command.patch | text/x-diff | 133.3 KB |
v19-0003-Refactor-index_concurrently_create_copy-for-use-.patch | text/x-diff | 4.1 KB |
v19-0004-Move-conversion-of-a-historic-to-MVCC-snapshot-t.patch | text/x-diff | 5.4 KB |
v19-0005-Add-CONCURRENTLY-option-to-REPACK-command.patch | text/x-diff | 147.4 KB |
v19-0006-Preserve-visibility-information-of-the-concurren.patch | text/x-diff | 56.6 KB |