Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

From: Jose Luis Tallon <jltallon(at)adv-solutions(dot)net>
To: Michael Paquier <michael(at)paquier(dot)xyz>, Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Surafel Temesgen <surafel3000(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly
Date: 2019-09-20 08:26:21
Message-ID: f3031524-9bcd-4078-b6b4-caffd076de97@adv-solutions.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 20/9/19 4:06, Michael Paquier wrote:
> On Thu, Sep 19, 2019 at 05:40:41PM +0300, Alexey Kondratov wrote:
>> On 19.09.2019 16:21, Robert Haas wrote:
>>> So, earlier in this thread, I suggested making this part of ALTER
>>> TABLE, and several people seemed to like that idea. Did we have a
>>> reason for dropping that approach?
> Personally, I don't find this idea very attractive as ALTER TABLE is
> already complicated enough with all the subqueries we already support
> in the command, all the logic we need to maintain to make combinations
> of those subqueries in a minimum number of steps, and also the number
> of bugs we have seen because of the amount of complication present.

Yes, but please keep the other options: At it is, cluster, vacuum full
and reindex already rewrite the table in full; Being able to write the
result to a different tablespace than the original object was stored in
enables a whole world of very interesting possibilities.... including a
quick way out of a "so little disk space available that vacuum won't
work properly" situation --- which I'm sure MANY users will appreciate,
including me

> If we add this option to REINDEX, then for 'ALTER TABLE tb_name action1,
> REINDEX SET TABLESPACE tbsp_name, action3' action2 will be just a direct
> alias to 'REINDEX TABLE tb_name SET TABLESPACE tbsp_name'. So it seems
> practical to do this for REINDEX first.
>
> The only one concern I have against adding REINDEX to ALTER TABLE in this
> context is that it will allow user to write such a chimera:
>
> ALTER TABLE tb_name REINDEX SET TABLESPACE tbsp_name, SET TABLESPACE
> tbsp_name;
>
> when they want to move both table and all the indexes. Because simple
> ALTER TABLE tb_name REINDEX, SET TABLESPACE tbsp_name;
> looks ambiguous. Should it change tablespace of table, indexes or both?

Indeed.

IMHO, that form of the command should not allow that much flexibility...
even on the "principle of least surprise" grounds :S

That is, I'd restrict the ability to change (output) tablespace to the
"direct" form --- REINDEX name, VACUUM (FULL) name, CLUSTER name ---
whereas the ALTER table|index SET TABLESPACE would continue to work.

Now that I come to think of it, maybe saying "output" or "move to"
rather than "set tablespace" would make more sense for this variation of
the commands? (clearer, less prone to confusion)?

> Tricky question, but we don't change the tablespace of indexes when
> using an ALTER TABLE, so I would say no on compatibility grounds.
> ALTER TABLE has never touched the tablespace of indexes, and I don't
> think that we should begin to do so.

Indeed.

I might be missing something, but is there any reason to not *require* a
explicit transaction for the above multi-action commands? I mean, have
it be:

BEGIN;

ALTER TABLE tb_name SET TABLESPACE tbsp_name;    -- moves the table ....
but possibly NOT the indexes?

ALTER TABLE tb_name REINDEX [OUTPUT TABLESPACE tbsp_name];    --
REINDEX, placing the resulting index on tbsp_name instead of the
original one

COMMIT;

... and have the parser/planner combine the steps if it'd make sense (it
probably wouldn't in this example)?

Just my .02€

Thanks,

    / J.L.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2019-09-20 09:02:52 Re: A problem about partitionwise join
Previous Message Fabien COELHO 2019-09-20 08:10:58 Re: psql - add SHOW_ALL_RESULTS option