Re: vacuumdb/clusterdb/reindexdb: allow specifying objects to process in all databases

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: nathandbossart(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: vacuumdb/clusterdb/reindexdb: allow specifying objects to process in all databases
Date: 2023-06-29 05:16:26
Message-ID: 20230629.141626.886053266048932185.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Wed, 28 Jun 2023 16:24:02 -0700, Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote in
> While working on some other patches, I found myself wanting to use the
> following command to vacuum the catalogs in all databases in a cluster:
>
> vacuumdb --all --schema pg_catalog
>
> However, this presently fails with the following error:
>
> cannot vacuum specific schema(s) in all databases
>
> AFAICT there no technical reason to block this, and the resulting behavior
> feels intuitive to me, so I wrote 0001 to allow it. 0002 allows specifying
> tables to process in all databases in clusterdb, and 0003 allows specifying
> tables, indexes, schemas, or the system catalogs to process in all
> databases in reindexdb.

It seems like useful.

> I debated also allowing users to specify different types of objects in the
> same command (e.g., "vacuumdb --schema myschema --table mytable"), but it
> looked like this would require a more substantial rewrite, and I didn't
> feel that the behavior was intuitive. For the example I just gave, does
> the user expect us to process both the "myschema" schema and the "mytable"
> table, or does the user want us to process the "mytable" table in the
> "myschema" schema? In vacuumdb, this is already blocked, but reindexdb

I think spcyfying the two at once is inconsistent if we maintain the
current behavior of those options.

It seems to me that that change clearly modifies the functionality of
the options. As a result, those options look like restriction
filters. For example, "vacuumdb -s s1_* -t t1" will vacuum all table
named "t1" in all schemas matches "s1_*".

> accepts combinations of tables, schemas, and indexes (yet disallows
> specifying --system along with other types of objects). Since this is
> inconsistent with vacuumdb and IMO ambiguous, I've restricted such
> combinations in 0003.
>
> Thoughts?

While I think this is useful, primarily for system catalogs, I'm not
entirely convinced about its practicality to user objects. It's
difficult for me to imagine that a situation where all databases share
the same schema would be major.

Assuming this is used for user objects, it may be necessary to safely
exclude databases that lack the specified schema or table, provided
the object present in at least one other database. But the exclusion
should be done with printing some warnings. It could also be
necessary to safely move to the next object when reindex or cluster
operation fails on a single object due to missing prerequisite
situations. But I don't think we might want to add such complexity to
these "script" tools.

So.. an alternative path might be to introduce a new option like
--syscatalog to specify system catalogs as the only option that can be
combined with --all. In doing so, we can leave the --table and
--schema options untouched.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2023-06-29 05:24:09 Re: harmonize password reuse in vacuumdb, clusterdb, and reindexdb
Previous Message Nathan Bossart 2023-06-29 04:38:58 Re: add \dpS to psq [16beta1]