Re: analyze-in-stages post upgrade questions

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, "Zechman, Derek S" <Derek(dot)S(dot)Zechman(at)snapon(dot)com>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: analyze-in-stages post upgrade questions
Date: 2025-06-27 21:39:17
Message-ID: 4d8122febd3007143504e4b6034b4253f7000761.camel@cybertec.at
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, 2025-06-27 at 08:31 -0700, Adrian Klaver wrote:
> On 6/27/25 06:35, Zechman, Derek S wrote:
> > We recently performed an upgrade from pg14 (14.18) to pg16 (16.9) and
> > performed the analyze-in-stages post upgrade.  It has been noticed that
> > some plans changed to use hash joins instead of nested loops.  Further
> > investigation found it was because the parent table of partitioned
> > tables did not have stats.  After running an ANALYZE on the parent
> > tables we got similar plan an execution times as before.
> >
> > I have two questions
> >
> > 1 - Why does analyze-in-stages not analyze the parent tables?
> >
> > 2 – What happens if we do not run analyze-in-stages post upgrade and
> > just run an analyze?
>
> It is spelled out in the docs:
>
> https://www.postgresql.org/docs/current/pgupgrade.html
>
> Emphasis added
>
> "Using vacuumdb --all --analyze-only can efficiently generate such
> statistics, and the use of --jobs can speed it up. Option
> --analyze-in-stages can be used to generate **minimal statistics**
> quickly. If vacuum_cost_delay is set to a non-zero value, this can be
> overridden to speed up statistics generation using PGOPTIONS, e.g.,
> PGOPTIONS='-c vacuum_cost_delay=0' vacuumdb ...."
>
> and from here:
>
> https://www.postgresql.org/docs/current/app-vacuumdb.html
>
> "--analyze-in-stages
>
> Only calculate statistics for use by the optimizer (no vacuum),
> like --analyze-only. Run three stages of analyze; the first stage uses
> the lowest possible statistics target (see default_statistics_target) to
> produce usable statistics faster, and subsequent stages build the full
> statistics.
>
> This option is only useful to analyze a database that currently has
> no statistics or has wholly incorrect ones, such as if it is newly
> populated from a restored dump or by pg_upgrade. Be aware that running
> with this option in a database with existing statistics may cause the
> query optimizer choices to become transiently worse due to the low
> statistics targets of the early stages.

Well, that wouldn't explain why it doesn't work on partitioned tables.
I am under the impression that it should.

Derek, can cou share the pg_stats entries for the partitioned table?

Yours,
Laurenz Albe

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Zechman, Derek S 2025-06-28 01:11:19 RE: analyze-in-stages post upgrade questions
Previous Message raphi 2025-06-27 17:00:36 Re: password rules