RE: analyze-in-stages post upgrade questions

From: "Zechman, Derek S" <Derek(dot)S(dot)Zechman(at)snapon(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: RE: analyze-in-stages post upgrade questions
Date: 2025-06-28 01:23:59
Message-ID: PH0PR04MB82943D67CD179002714AA503C044A@PH0PR04MB8294.namprd04.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-general

> > We recently performed an upgrade from pg14 (14.18) to pg16 (16.9) and

> > performed the analyze-in-stages post upgrade. It has been noticed that

> > some plans changed to use hash joins instead of nested loops. Further

> > investigation found it was because the parent table of partitioned

> > tables did not have stats. After running an ANALYZE on the parent

> > tables we got similar plan an execution times as before.

> >

> > I have two questions

> >

> > 1 - Why does analyze-in-stages not analyze the parent tables?

> >

> > 2 – What happens if we do not run analyze-in-stages post upgrade and

> > just run an analyze?

>

> It is spelled out in the docs:

>

> https://urldefense.com/v3/__https://www.postgresql.org/docs/current/pgupgrade.html__;!!Lf_9VycLqA!jU5QgA1re0Txg_h2dD1N3XvK_l8hYdyMvpcxrLL5GDyQ5qN4aGQcxmDE8qmaV_p5telTzYmOL6S3fR8eRc0s_8UvnFFR6Ws$<https://urldefense.com/v3/__https:/www.postgresql.org/docs/current/pgupgrade.html__;!!Lf_9VycLqA!jU5QgA1re0Txg_h2dD1N3XvK_l8hYdyMvpcxrLL5GDyQ5qN4aGQcxmDE8qmaV_p5telTzYmOL6S3fR8eRc0s_8UvnFFR6Ws$>

>

> Emphasis added

>

> "Using vacuumdb --all --analyze-only can efficiently generate such

> statistics, and the use of --jobs can speed it up. Option

> --analyze-in-stages can be used to generate **minimal statistics**

> quickly. If vacuum_cost_delay is set to a non-zero value, this can be

> overridden to speed up statistics generation using PGOPTIONS, e.g.,

> PGOPTIONS='-c vacuum_cost_delay=0' vacuumdb ...."

>

> and from here:

>

> https://urldefense.com/v3/__https://www.postgresql.org/docs/current/app-vacuumdb.html__;!!Lf_9VycLqA!jU5QgA1re0Txg_h2dD1N3XvK_l8hYdyMvpcxrLL5GDyQ5qN4aGQcxmDE8qmaV_p5telTzYmOL6S3fR8eRc0s_8UvvC7gfd0$<https://urldefense.com/v3/__https:/www.postgresql.org/docs/current/app-vacuumdb.html__;!!Lf_9VycLqA!jU5QgA1re0Txg_h2dD1N3XvK_l8hYdyMvpcxrLL5GDyQ5qN4aGQcxmDE8qmaV_p5telTzYmOL6S3fR8eRc0s_8UvvC7gfd0$>

>

> "--analyze-in-stages

>

> Only calculate statistics for use by the optimizer (no vacuum),

> like --analyze-only. Run three stages of analyze; the first stage uses

> the lowest possible statistics target (see default_statistics_target) to

> produce usable statistics faster, and subsequent stages build the full

> statistics.

>

> This option is only useful to analyze a database that currently has

> no statistics or has wholly incorrect ones, such as if it is newly

> populated from a restored dump or by pg_upgrade. Be aware that running

> with this option in a database with existing statistics may cause the

> query optimizer choices to become transiently worse due to the low

> statistics targets of the early stages.

Well, that wouldn't explain why it doesn't work on partitioned tables.

I am under the impression that it should.

Derek, can cou share the pg_stats entries for the partitioned table?

Yours,

Laurenz Albe

There are no entries in pg_stats for the parent table until after I manually run an analyze on it – Example below

=> select relname, reltuples, relkind from pg_class where relname ~ '^chapter_[0-9]+$' or relname='chapter' order by 1;

relname | reltuples | relkind

-------------+-----------+---------

chapter | -1 | p

chapter_1 | 4 | r

chapter_10 | 4 | r

chapter_100 | 30 | r

chapter_101 | 15 | r

chapter_102 | 15 | r

=> select count(*) from pg_stats where tablename='chapter';

count

-------

0

(1 row)

=> analyze chapter;

ANALYZE

=> select relname, reltuples, relkind from pg_class where relkind ='p' and relname='chapter';

relname | reltuples | relkind

---------+-----------+---------

chapter | 7589 | p

(1 row)

=> select count(*) from pg_stats where tablename='chapter';

count

-------

49

(1 row)

toy_epc_stg_1_db=>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Laurenz Albe 2025-06-28 05:25:30 Re: analyze-in-stages post upgrade questions
Previous Message Zechman, Derek S 2025-06-28 01:11:19 RE: analyze-in-stages post upgrade questions