Re: CLUSTER on partitioned index

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, 李杰(慎追) <adger(dot)lj(at)alibaba-inc(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, 曾文旌(义从) <wenjing(dot)zwj(at)alibaba-inc(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>
Subject: Re: CLUSTER on partitioned index
Date: 2021-09-12 20:10:29
Message-ID: 20210912201029.GG26465@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 20, 2021 at 08:27:02PM -0400, Alvaro Herrera wrote:
> I have to wonder if there really *is* a use case for CLUSTER in the
> first place on regular tables, let alone on partitioned tables, which
> are likely to be large and thus take a lot of time.

The cluster now is done one partition at a time, so it might take a long time,
but doesn't lock the whole partition heirarchy. Same as VACUUM (since v10) and
(since v14) REINDEX.

The patch series would be simpler if partitioned indexes weren't allowed to be
marked CLUSTERED ON. Then, "USING <index>" would be required, which is a step
forward from not supporting cluster on partitioned index at all. As attached.
It's arguably true that the follow-up patches supporting indisclustered on
partitioned indexes aren't worth the trouble.

For sure CLUSTER is useful, see eg.
https://github.com/bucardo/check_postgres/issues/29

It's sometimes important that the table is clustered to allow index scan to
work well (or be chosen at all).

If a table is scanned by an index, and isn't well-clustered, then a larger
fraction (multiple) of the table will be read than what's optimal. That
requires more IO, and more cache space.

A year ago, I partitioned one of our previously-unpartitioned tables, and ended
up clustering the partitions on their partition key (and indexed column) using
\gexec. This was preferable to doing INSERT .. SELECT .. ORDER BY, which
would've made the initial process slower - maybe unjustifiably slower for some
customers. Cluster (using \gexec) was something I was able to do afterward,
for completeness, since I expect the partitions to be mostly-clustered
automatically, so it was bothering me that the existing data was unordered, and
that it might behave differently in the future.

> What justifies spending so much time on this implementation?

Actually, I don't use partitioned indexes at all here, so this is not for us..
I worked on this after Adger asked about CIC on partitioned tables (for which I
have a patch in the queue). Isn't it worth supporting that (or should we
include an example about how to use format() with %I and \gexec) ?

VACUUM [FULL] has recursed into partitions since v10 (f0e44751d).
REINDEX supports partitioned tables in v14 (a6642b3ae).
Partitioned indexes exist since v11 (as you well know), so it's somewhat odd
that CLUSTER isn't supported, and seems increasingly weird as decreasing number
of DDL commands are not supported. Supporting DDL on partitioned tables
supports the idea that the physical partitions can be seen as an implementation
detail by the DBA, which I understand was the intent since v10.

You're right that I wouldn't plan to *routinely* re-cluster a partitioned
table. Rather, I'd cluster only its "recent" *partitions*, and leave the old
ones alone. Or cluster the partitions, a single time, once they're no longer
recent. I don't think the feature is marginal just because I don't use it
routinely.

> My impression is that CLUSTER is pretty much a fringe command nowadays,
> because of the access exclusive lock required.

A step forward would be to integrate something like pg_repack/reorg/squeeze.
I used pg_repack --index until v12 got REINDEX CONCURRENTLY. The goal there
was to improve index scans on some large, append-only partitions where the
planner gave an index scan, but performance was poor (now, we use BRIN so it
works well without reindex). I tested that this would still be an issue by
creating a non-brin index for a single day's table (even with v13 deduplication
and v12 TID tiebreak).

As I see it, support for partitioned cluster is orthogonal to an
online/concurrent cluster, which is a job for another patch.

--
Justin

Attachment Content-Type Size
v11-0001-Implement-CLUSTER-of-partitioned-table.patch text/x-diff 16.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bossart, Nathan 2021-09-12 23:02:49 Re: Remove duplicate static function check_permissions in slotfuncs.c and logicalfuncs.c
Previous Message Dean Rasheed 2021-09-12 19:36:05 Re: Numeric x^y for negative x