Re: analyze-in-stages post upgrade questions

From: Mircea Cadariu <cadariu(dot)mircea(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, "Zechman, Derek S" <Derek(dot)S(dot)Zechman(at)snapon(dot)com>, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: analyze-in-stages post upgrade questions
Date: 2025-08-06 04:01:18
Message-ID: 9c9cb744-b8c7-4d75-acad-595b3faba187@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Hi,

On 30/07/2025 12:49, Fujii Masao wrote:
> I've started reviewing the patch since it's marked as ready for committer.
Thanks!
> Overall, I like the change. But I have one question: should this be treated as
> a bug fix that we back-patch to supported branches, or is it more of
> an improvement that should only go into master?
I reckon it might make sense to back-patch it to previous versions, as
users might not upgrade always to the latest version.
> Only calculate statistics for use by the optimizer (no vacuum).
> + If that option is specified, <command>vacuumdb</command> will also
> + process partitioned tables. Without that option, only the partitions
> + will be considered, unless a partitioned table is explicitly specified
> + with the <option>--table</option> option.
>
> This wording seems a bit out of place in the --analyze-only section,
> since it also describes the default behavior of vacuumdb without that option.
> Wouldn't it make more sense to move that explanation in the --table section?
>
> For example, we could add something like:
>
> ------------------
> If no tables are specified with the --table option, vacuumdb will
> clean all regular tables and materialized views in the connected
> database. If --analyze-only or --analyze-in-stages is also specified,
> it will analyze all regular tables, partitioned tables, and
> materialized views (but not foreign tables).
> ------------------
Yes, agreed.
> + /*
> + * VACUUMing partitioned tables would be unreasonably expensive, since
> + * that entails processing the partitions twice (once as part of the
> + * partitioned table, once as tables in their own right) for no
> + * benefit. But if we only ANALYZE, collecting statistics for
> + * partitioned tables is worth the effort.
> + */
>
> This is probably true. But isn't the main reason more about aligning with
> the behavior of the underlying VACUUM and ANALYZE commands? As the vacuumdb
> docs says, "There is no effective difference between vacuuming and analyzing
> databases via this utility and via other methods for accessing the server.",
> so its default target objects should match: VACUUM skips partitioned tables
> by default, while ANALYZE includes them. If that's the case, maybe the comment
> should reflect that instead.
I see what you mean. From that perspective, I wonder if we even need a
comment there at all.
> + qr/statement:\s+ANALYZE\s+public\.parent_table/s,
> + '--analyze_only updates statistics for partitioned tables');
>
> A plain space might be sufficient instead of \s+.
> Also, I don't think the backslash before ".parent_table" is necessary.

Good catch! Indeed let's simplify that to contain strictly only what's
necessary.

Kind regards,

Mircea Cadariu

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Simon Connah 2025-08-06 12:03:41 Stored procedures or raw queries
Previous Message Aleš Zelený 2025-08-05 15:25:32 Re: PGDG16 repository error - repomd.xml.asc is empty

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2025-08-06 04:11:16 Re: Fixed a minor typo in code comment
Previous Message shveta malik 2025-08-06 03:51:27 Re: POC: enable logical decoding when wal_level = 'replica' without a server restart