Re: A few new options for vacuumdb

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A few new options for vacuumdb
Date: 2018-12-20 02:05:22
Message-ID: 20181220020522.GM27104@paquier.xyz
Lists: pgsql-hackers

On Wed, Dec 19, 2018 at 08:50:10PM +0000, Bossart, Nathan wrote:
> If an option is specified for a server version that is not supported,
> the option is silently ignored. For example, SKIP_LOCKED was only
> added to VACUUM and ANALYZE for v12. Alternatively, I think we could
> fail in vacuum_one_database() if an unsupported option is specified.
> Some of these options will work on all currently supported versions,
> so I am curious what others think about skipping some of these version
> checks altogether.

prepare_vacuum_command() already handles that by silently ignoring
unsupported options (see the case of SKIP_LOCKED). So why not do the
same?
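
To make the idea concrete, here is a rough sketch of version-gated option
assembly. This is not the actual prepare_vacuum_command(): the names
sketch_opts, sketch_append() and sketch_build_vacuum() are made up for
illustration, and the only version fact taken from this thread is that
SKIP_LOCKED needs a v12 server (encoded as 120000).

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical option set for this sketch; not vacuumdb's real struct. */
typedef struct
{
    bool        skip_locked;    /* SKIP_LOCKED: needs server v12 or newer */
    bool        verbose;        /* VERBOSE: assumed fine on any version here */
} sketch_opts;

/* Append one option, opening the parenthesized list on first use. */
static void
sketch_append(char *buf, size_t buflen, const char *opt, int *count)
{
    size_t      len = strlen(buf);

    snprintf(buf + len, buflen - len, "%s%s",
             (*count)++ == 0 ? " (" : ", ", opt);
}

/*
 * Build a VACUUM command into buf, dropping any option the server is too
 * old to understand.  Returns the number of options actually emitted.
 */
static int
sketch_build_vacuum(char *buf, size_t buflen, int server_version,
                    const sketch_opts *opts)
{
    int         count = 0;
    size_t      len;

    snprintf(buf, buflen, "VACUUM");

    /* Unsupported on older servers: skip it without raising an error. */
    if (opts->skip_locked && server_version >= 120000)
        sketch_append(buf, buflen, "SKIP_LOCKED", &count);
    if (opts->verbose)
        sketch_append(buf, buflen, "VERBOSE", &count);

    len = strlen(buf);
    snprintf(buf + len, buflen - len, "%s;", count > 0 ? ")" : "");
    return count;
}
```

The new options could follow the same pattern, each with its own minimum
version, so that nothing has to fail in vacuum_one_database().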

> It does not seem clear whether the user wants us to process mytable
> only if it is at least 1 GB, or if we should process mytable in
> addition to any other relations over 1 GB. Either way, I think trying
> to support these combinations of options adds more complexity than it
> is worth.

It seems to me that a combination of both options means that the listed
table should be processed only if its size is at least 1GB. If multiple
tables are specified with --table, then only those reaching 1GB would be
processed. So this restriction can go away. The same applies to the
proposed --min-xid-age and --min-mxid-age.
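
In other words, the filters are ANDed together. A minimal in-memory sketch
of that rule follows. In practice the filtering would presumably be done in
the catalog query rather than in client code, and sketch_rel and
sketch_should_process() are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one relation; not a vacuumdb structure. */
typedef struct
{
    const char *name;
    long long   size_bytes;
} sketch_rel;

/*
 * A relation is processed only if it passes every filter that was given:
 * it must be in the --table list (when one was supplied) AND reach the
 * minimum size.  An empty list means "no name restriction".
 */
static bool
sketch_should_process(const sketch_rel *rel,
                      const char *const *wanted, size_t nwanted,
                      long long min_size_bytes)
{
    bool        listed = (nwanted == 0);
    size_t      i;

    for (i = 0; i < nwanted && !listed; i++)
        listed = (strcmp(rel->name, wanted[i]) == 0);

    return listed && rel->size_bytes >= min_size_bytes;
}
```

The age thresholds would just be further conditions in the same
conjunction.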

+ <para>
+ Only execute the vacuum or analyze commands on tables with a multixact
+ ID age of at least <replaceable class="parameter">mxid_age</replaceable>.
+ </para>
Adding a link to explain the multixact business may be helpful, say
vacuum-for-multixact-wraparound. Same comment for XID.

> 0001 is a minor fix that is somewhat separate from these new options,
> although the new options will make the edge case it aims to fix much
> easier to reach. When the catalogs are queried in parallel mode to
> get the list of tables to process, we currently assume that at least
> one table will be returned. If no tables are found, the tables
> variable will stay as NULL, which leads to database-wide VACUUM or
> ANALYZE commands. Since there are currently no user-configurable
> options available for this catalog query, this case is likely
> exceptionally rare. However, with the new options, it is much easier
> to inadvertently filter out all relations.

Agreed. No need to visibly bother about that in back-branches.
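
For the record, the distinction that matters is between "no table list was
requested" and "the catalog query returned nothing". A rough sketch, with
the hypothetical names sketch_plan and sketch_choose_plan() standing in for
whatever the patch actually does:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of deciding what to run; illustration only. */
typedef enum
{
    SKETCH_NOTHING,             /* catalog query matched no relation */
    SKETCH_WHOLE_DB,            /* no list requested: database-wide command */
    SKETCH_PER_TABLE            /* one command per relation found */
} sketch_plan;

/*
 * When the catalogs were queried and nothing matched, do nothing instead
 * of falling through to a database-wide VACUUM or ANALYZE.
 */
static sketch_plan
sketch_choose_plan(bool catalog_queried, size_t ntables)
{
    if (!catalog_queried)
        return SKETCH_WHOLE_DB;
    return (ntables == 0) ? SKETCH_NOTHING : SKETCH_PER_TABLE;
}
```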

There is also an argument for adding DISABLE_PAGE_SKIPPING.
--
Michael
