Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

From: Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Steve Singer <steve(at)ssinger(dot)info>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Jose Luis Tallon <jltallon(at)adv-solutions(dot)net>
Subject: Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly
Date: 2021-01-21 14:06:06
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2021-01-21 04:41, Michael Paquier wrote:
> On Wed, Jan 20, 2021 at 03:34:39PM -0300, Alvaro Herrera wrote:
>> On 2021-Jan-20, Alexey Kondratov wrote:
>>> Ugh, forgot to attach the patches. Here they are.
>> Yeah, looks reasonable.
>>> +
>>> + if (changed)
>>> + /* Record dependency on tablespace */
>>> + changeDependencyOnTablespace(RelationRelationId,
>>> + reloid, rd_rel->reltablespace);
>> Why have a separate "if (changed)" block here instead of merging with
>> the above?
> Yep.

Sure, this is a refactoring artifact.

> + if (SetRelTablespace(reloid, newTableSpace))
> + /* Make sure the reltablespace change is visible */
> + CommandCounterIncrement();
> At quick glance, I am wondering why you just don't do a CCI within
> SetRelTablespace().

I did it that way for a better readability at first, since it looks more
natural when you do some change (SetRelTablespace) and then make them
visible with CCI. Second argument was that in the case of
reindex_index() we have to also call RelationAssumeNewRelfilenode() and
RelationDropStorage() before doing CCI and making the new tablespace
visible. And this part is critical, I guess.

> + This specifies that indexes will be rebuilt on a new tablespace.
> + Cannot be used with "mapped" relations. If
> <literal>SCHEMA</literal>,
> + <literal>DATABASE</literal> or <literal>SYSTEM</literal> is
> specified, then
> + all unsuitable relations will be skipped and a single
> <literal>WARNING</literal>
> + will be generated.
> What is an unsuitable relation? How can the end user know that?

This was referring to mapped relations mentioned in the previous
sentence. I have tried to rewrite this part and make it more specific in
my current version. Also added Justin's changes to the docs and comment.

> This is missing ACL checks when moving the index into a new location,
> so this requires some pg_tablespace_aclcheck() calls, and the other
> patches share the same issue.

I added proper pg_tablespace_aclcheck()'s into the reindex_index() and

> + else if (partkind == RELKIND_PARTITIONED_TABLE)
> + {
> + Relation rel = table_open(partoid, ShareLock);
> + List *indexIds = RelationGetIndexList(rel);
> + ListCell *lc;
> +
> + table_close(rel, NoLock);
> + foreach (lc, indexIds)
> + {
> + Oid indexid = lfirst_oid(lc);
> + (void) set_rel_tablespace(indexid,
> params->tablespaceOid);
> + }
> + }
> This is really a good question. ReindexPartitions() would trigger one
> transaction per leaf to work on. Changing the tablespace of the
> partitioned table(s) before doing any work has the advantage to tell
> any new partition to use the new tablespace. Now, I see a struggling
> point here: what should we do if the processing fails in the middle of
> the move, leaving a portion of the leaves in the previous tablespace?
> On a follow-up reindex with the same command, should the command force
> a reindex even on the partitions that have been moved? Or could there
> be a point in skipping the partitions that are already on the new
> tablespace and only process the ones on the previous tablespace? It
> seems to me that the first scenario makes the most sense as currently
> a REINDEX works on all the relations defined, though there could be
> use cases for the second case. This should be documented, I think.

I agree that follow-up REINDEX should also reindex moved partitions,
since REINDEX (TABLESPACE ...) is still reindex at first. I will try to
put something about this part into the docs. Also I think that we cannot
be sure that nothing happened with already reindexed partitions between
two consequent REINDEX calls.

> There are no tests for partitioned tables, aka we'd want to make sure
> that the new partitioned index is on the correct tablespace, as well
> as all its leaves. It may be better to have at least two levels of
> partitioned tables, as well as a partitioned table with no leaves in
> the cases dealt with.

Yes, sure, it makes sense.

> + *
> + * Even if a table's indexes were moved to a new tablespace,
> the index
> + * on its toast table is not normally moved.
> */
> Still, REINDEX (TABLESPACE) TABLE should move all of them to be
> consistent with ALTER TABLE SET TABLESPACE, but that's not the case
> with this code, no? This requires proper test coverage, but there is
> nothing of the kind in this patch.

You are right, we do not move TOAST indexes now, since
IsSystemRelation() is true for TOAST indexes, so I thought that we
should not allow moving them without allow_system_table_mods=true. Now I
wonder why ALTER TABLE does that.

I am going to attach the new version of patch set today or tomorrow.

Alexey Kondratov

Postgres Professional
Russian Postgres Company

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2021-01-21 14:14:10 Re: patch: reduce overhead of execution of CALL statement in no atomic mode from PL/pgSQL
Previous Message Pavel Stehule 2021-01-21 13:37:39 Re: patch: reduce overhead of execution of CALL statement in no atomic mode from PL/pgSQL