Re: REPACK and naming

From: Antonin Houska <ah(at)cybertec(dot)at>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: alvherre(at)alvh(dot)no-ip(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: REPACK and naming
Date: 2025-09-19 12:29:58
Message-ID: 13076.1758284998@localhost
Lists: pgsql-hackers

David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Fri, 19 Sept 2025 at 23:58, Antonin Houska <ah(at)cybertec(dot)at> wrote:
> > Admittedly I haven't thought about a clause like ORDER BY yet, but I wonder if
> > it'd really be useful. My understanding is that the purpose of clustering is
> > to make index scans more efficient: with a clustered table, the heap tuples
> > pertaining to a given index tuple should be located on the same page, so the
> > heap access is not that random.
>
> I imagine that's true most of the time, but it could also be so that
> fewer pages are dirtied when an UPDATE updates a set of rows with the
> same or similar clustered column values.

Good point.
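
To put rough numbers on both effects, here is a toy model. It is only a sketch and not PostgreSQL code: the function names, the 100 tuples per page, and the key range are made up for illustration. It counts how many distinct heap pages hold the tuples for a range of adjacent key values, which approximates both the pages an index range scan has to visit and the pages an UPDATE of those rows would dirty.

```python
import random


def pages_touched(heap_order, keys_wanted, tuples_per_page=100):
    """Count distinct heap pages holding tuples whose key is in keys_wanted.

    heap_order lists the keys in physical (heap) order; position // tuples_per_page
    is the page number in this simplified model.
    """
    pages = set()
    for pos, key in enumerate(heap_order):
        if key in keys_wanted:
            pages.add(pos // tuples_per_page)
    return len(pages)


def demo(n_tuples=10_000, tuples_per_page=100, seed=0):
    """Return (pages for clustered layout, pages for shuffled layout)."""
    rng = random.Random(seed)
    keys = list(range(n_tuples))

    clustered = keys[:]        # physical order follows key order
    unclustered = keys[:]
    rng.shuffle(unclustered)   # physical order unrelated to key order

    wanted = set(range(1000, 1200))  # 200 adjacent key values
    return (
        pages_touched(clustered, wanted, tuples_per_page),
        pages_touched(unclustered, wanted, tuples_per_page),
    )


if __name__ == "__main__":
    clustered_pages, unclustered_pages = demo()
    print("clustered:", clustered_pages, "unclustered:", unclustered_pages)
```

In this model the clustered layout needs exactly two pages for the 200 rows, while the shuffled layout spreads them over most of the 100 pages in the table.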

> > If an IOT-AM table does not have anything like an index, I imagine it has some
> > kind of ordering information in the system catalog. Without that the query
> > planner can hardly utilize the ordering. In that case REPACK should use the
> > catalog information on ordering rather than accept an arbitrary ORDER BY clause.
>
> Well, it would be impossible to insert records without some metadata
> to indicate the IOT keys...
>
> You might assume that someone might change their mind one day about
> the chosen order and wish to change it. My point was about leaving the
> door open to support that by having some native syntax that could be
> used to trigger that change.

I doubted whether the current AM API is designed to do catalog changes, but
then recalled that CLUSTER does set pg_index.indisclustered, and that it does
so outside table_relation_copy_for_cluster(). So I can now imagine that REPACK
... ORDER BY can do something like that.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
