Re: REPACK and naming

From: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: REPACK and naming
Date: 2025-09-17 15:03:42
Message-ID: 202509171453.m4j5lj2irran@alvherre.pgsql
Lists: pgsql-hackers

On 2025-Sep-17, Tom Lane wrote:

> I'm not at all in love with documenting VACUUM FULL and CLUSTER as
> being fundamentally the same thing. I think that is an implementation
> happenstance that could go away as easily as it appeared. Even if you
> think we'll never again rewrite it for heap, what of other table AMs?
> The underlying reality could be totally different for them.

So there are two operations here. One is
REPACK tab USING INDEX idx
which we currently call CLUSTER, and there is also
REPACK tab
(no index specified) which we currently call VACUUM FULL. These have
the very specific charter of rewriting the table while removing bloat,
the distinction being that they keep the rows ordered according to the
index or not. Both these operations currently use the same
implementation, yes; but if we were to reimplement one of them using
some completely different piece of code, the new command name would
continue to work: it would just call the new implementation, while
the other command would continue to call the old one. (Or maybe we decide
to reimplement both using different techniques and throw away
cluster.c; the command names would still be sensible and would
continue to work.)

Thinking about the other half of your argument: if we add new table AMs
for which the cluster.c implementation doesn't work, then we'll have to
wire the table AM support routines so that REPACK or REPACK USING INDEX
calls some different implementation. This is no different from keeping
these commands as VACUUM FULL and CLUSTER; we would still need a
different implementation underneath, and we would still need to wire the
table AM support routines to call that different implementation.
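
To make that concrete, here is a toy sketch in Python. It is not
PostgreSQL code: the "columnar" AM, the AM_REWRITE table and every
function name below are hypothetical, made up only to show that the
per-AM wiring is the same whichever command name sits on top of it.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for a table AM's rewrite routine: it takes a table
# name and an optional index name and reports what it did.
RewriteFn = Callable[[str, Optional[str]], str]


def heap_rewrite(table: str, index: Optional[str]) -> str:
    # Stand-in for the shared code path that heap uses today.
    order = f"ordered by {index}" if index else "unordered"
    return f"heap rewrite of {table} ({order})"


def columnar_rewrite(table: str, index: Optional[str]) -> str:
    # Hypothetical other table AM with a completely different implementation.
    order = f"ordered by {index}" if index else "unordered"
    return f"columnar rewrite of {table} ({order})"


# Per-AM wiring. This lookup is needed no matter what the command is called.
AM_REWRITE: Dict[str, RewriteFn] = {
    "heap": heap_rewrite,
    "columnar": columnar_rewrite,
}


def rewrite_table(am: str, table: str, index: Optional[str] = None) -> str:
    # Dispatch to whatever implementation the table's AM provides.
    return AM_REWRITE[am](table, index)


# The old and new command spellings are thin front ends over the same dispatch.
def repack(am: str, table: str, using_index: Optional[str] = None) -> str:
    return rewrite_table(am, table, using_index)


def vacuum_full(am: str, table: str) -> str:
    return rewrite_table(am, table, None)


def cluster(am: str, table: str, index: str) -> str:
    return rewrite_table(am, table, index)
```

In this sketch, swapping the heap entry for some other function changes
nothing about repack(), vacuum_full() or cluster(); only the wiring table
is touched.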

So all things considered, I'm not seeing exactly which aspect of the
renaming you are uncomfortable with. We're not making the situation any
worse.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
<Schwern> It does it in a really, really complicated way
<crab> why does it need to be complicated?
<Schwern> Because it's MakeMaker.
