Re: tablecmds: reject CLUSTER ON for partitioned tables earlier

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>, Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: tablecmds: reject CLUSTER ON for partitioned tables earlier
Date: 2026-01-28 01:56:36
Message-ID: A1847A7B-7CB8-4A03-8CEA-FF7D3991DEDA@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Jan 27, 2026, at 15:48, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Tue, Jan 27, 2026 at 07:13:04AM +0800, Chao Li wrote:
>> I added two new test cases in 0002 that trigger the check.
>>
>> BTW, this is the CF entry:
>> https://commitfest.postgresql.org/patch/6415/. You may mark yourself
>> as a reviewer, and once you consider the patch is ready to go, would
>> you mind change the status to Ready For Committer?
>
> There is more to this set of changes than it looks at first sight.
>
> Hence, one question about 0001: can the previous error path in
> mark_index_clustered() be reached through a different mean than ALTER
> TABLE? If yes, we should have a test for it. If no, it could be
> switched to an elog(ERROR) or an assertion. The code paths leading to
> the previous error should be analyzed further.

Okay, I spent time today investigating mark_index_clustered() today.

First, I reset the source tree to 05fb5d6 where the partitioned table check was added to mark_index_clustered(). The commit subject is "Ignore partitioned indexes where appropriate”. It added the check in 3 functions:

* cluster()
* mark_index_clustered()
* get_relation_info() - not in scope of this discussion

At this commit, ALTER TABLE … CLUSTER ON / SET WITHOUT CLUSTER code patch could reach mark_index_clustered(). Other than that, mark_index_clustered() was only called by rebuild_relation() when the parameter indexOid is valid; rebuild_relation() was only called by cluster_rel(); and cluster_rel() was called by vacuum_rel() and cluster().

* For the cluster() code patch: because of the check added to cluster() by this commit, partitioned table would return early, thus mark_index_clustered() was actually not called.
* For the vacuum_rel() code path: there was already a check for partitioned table to return early, thus cluster_rel() won’t be called against partitioned tables, so that mark_index_clustered() could not be called either.

So, looks like the check added to mark_index_clustered() was only for the ALTER TABLE code path.

Then, switching the source tree back to this patch. Now, for the ALTER TABLE code path, 0001 ensures partitioned table won’t reach mark_index_clustered().

Other than ALTER TABLE, mark_index_clustered() is only called by rebuild_relation(); rebuild_relation() is only called by cluster_rel(); cluster_rel() is called by vacuum_rel() and cluster(). So, the call paths are the same as commit 05fb5d6.

For the cluster() code path, I traced this scenario:
```
evantest=# create table p (id int) partition by range (id);
CREATE TABLE
evantest=# create table p1 partition of p for values from (0) to (10);
CREATE TABLE
evantest=# create index p_idx on p (id);
CREATE INDEX
evantest=# cluster p using p_idx;
CLUSTER
```
The code ran into cluster(). Now, cluster() is much complicated than it was in commit 05fb5d6. For a partitioned table, it iterates all leaf partitions to call cluster_rel():
```
/*
* Either we're processing a partitioned table, or we were not given any
* table name at all. In either case, obtain a list of relations to
* process.
*
* In the former case, an index name must have been given, so we don't
* need to recheck its "indisclustered" bit, but we have to check that it
* is an index that we can cluster on. In the latter case, we set the
* option bit to have indisclustered verified.
*
* Rechecking the relation itself is necessary here in all cases.
*/
params.options |= CLUOPT_RECHECK;
if (rel != NULL)
{
Assert(rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);
check_index_is_clusterable(rel, indexOid, AccessShareLock);
rtcs = get_tables_to_cluster_partitioned(cluster_context, indexOid);

/* close relation, releasing lock on parent table */
table_close(rel, AccessExclusiveLock);
}
….
cluster_multiple_rels(rtcs, &params); // cluster_rel() is called here
```
So, partitioned table should never reach mark_index_clustered() from the cluster() code patch.

For the vacuum_rel() code patch, same as before, partitioned table will return early, cluster_rel() won’t be called at all:
```
/*
* Silently ignore partitioned tables as there is no work to be done. The
* useful work is on their child partitions, which have been queued up for
* us separately.
*/
if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
{
relation_close(rel, lmode);
PopActiveSnapshot();
CommitTransactionCommand();
/* It's OK to proceed with ANALYZE on this table */
return true;
}
```

In summary, with 0001, the partitioned table check in mark_index_clustered() is no longer needed. But as mark_index_clustered() is an extern-ed function, it might have future callers, I think we can change ereport(ERROR) to an Assert(). I will include the change in next revision.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tender Wang 2026-01-28 01:59:00 Re: Optimize IS DISTINCT FROM with non-nullable inputs
Previous Message Corey Huinker 2026-01-28 00:59:25 Re: Extended Statistics set/restore/clear functions.