Re: REPACK and naming

From: Álvaro Herrera <alvherre(at)kurilemu(dot)de>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: REPACK and naming
Date: 2025-09-19 12:49:02
Message-ID: 202509191243.7o2i3qnbhjmb@alvherre.pgsql
Lists: pgsql-hackers

On 2025-Sep-19, Antonin Houska wrote:

> Admittedly I haven't thought about clause like ORDER BY yet, but I wonder if
> it'd really be useful. My understanding is that the purpose of clustering is
> to make index scan more efficient:

Not necessarily. For some queries in some workloads, having tuples in a
certain order for a seqscan might also give a considerable performance
benefit. More so with, say, BRIN indexes, where having one tuple in one
page range or another could mean either having to scan that page range
or being able to skip it completely.
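
To illustrate the BRIN point, here is a toy sketch in Python. It is not
how PostgreSQL implements BRIN; it only models the idea that each page
range keeps a min/max summary and that a range is scanned only when the
summary overlaps the query bounds. The function name, the
ROWS_PER_RANGE constant (standing in for pages_per_range), and the data
are all made up for the example.

```python
import random


def ranges_to_scan(values, rows_per_range, lo, hi):
    """Return (ranges_scanned, total_ranges) for a lo..hi range query.

    Toy model of a min/max summary index: a range must be scanned when
    its [min, max] summary overlaps [lo, hi]; otherwise it is skipped.
    """
    scanned = 0
    total = 0
    for start in range(0, len(values), rows_per_range):
        chunk = values[start:start + rows_per_range]
        total += 1
        if min(chunk) <= hi and max(chunk) >= lo:
            scanned += 1
    return scanned, total


ROWS_PER_RANGE = 100  # hypothetical stand-in for pages_per_range

random.seed(0)
shuffled = list(range(10_000))
random.shuffle(shuffled)      # heap in arbitrary physical order
ordered = sorted(shuffled)    # heap after rewriting in key order

# Same query (2500 <= key <= 2599) against both physical layouts.
ordered_result = ranges_to_scan(ordered, ROWS_PER_RANGE, 2_500, 2_599)
shuffled_result = ranges_to_scan(shuffled, ROWS_PER_RANGE, 2_500, 2_599)

print("ordered: ", ordered_result)
print("shuffled:", shuffled_result)
```

With the ordered layout the matching keys sit in a single range, so the
other ranges can be skipped; with the shuffled layout nearly every
range's summary overlaps the query bounds and has to be scanned.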

> with a clustered table, the heap tuples
> pertaining to given index tuple should be located on the same page, so the
> heap access is not that random.

Yes, I suppose this is the first-order reason, and probably why we
currently only support basing clustering on an index. But I doubt it's
the only one. (It's also worth pointing out that quite possibly having
REPACK CONCURRENTLY is going to make clustering a lot more popular;
without concurrency, clustering is practically useless.)

> If IOT-AM table does not have anything like index, I imagine it has some kind
> of ordering information in the system catalog. Without that the query planner
> can hardly utilize the ordering.

Sure.

> In such case REPACK should use the catalog information on ordering
> rather than accept arbitrary ORDER BY clause.

... but, as David said, it might be valuable to change that ordering for
whatever reason.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
