Re: Optional skipping of unchanged relations during ANALYZE?

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: VASUKI M <vasukianand0119(at)gmail(dot)com>
Cc: Robert Treat <rob(at)xzilla(dot)net>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Christoph Berg <myon(at)debian(dot)org>
Subject: Re: Optional skipping of unchanged relations during ANALYZE?
Date: 2026-01-29 21:14:18
Message-ID: CAA5RZ0uBeeM1psLPpWsw3rxRVpewKnMfs+jz+XAsrck7UW6YyQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sorry for the quick follow-up.

> > This is conceptually similar to vacuumdb --missing-stats-only, but exposed
> > at the SQL ANALYZE level for interactive and scripted use.
>
> What is being proposed in v3 is not similar in behavior.
>
> I am wondering if we should take the current SQL used by vacuumdb to
> find missing stats and perform direct syscache lookups in C?
>
> The benefit there is we can also change vacuumdb to use the ANALYZE
> (MISSING_STATS) option
> directly rather than deriving the SQL, and the MISSING_STATS behavior
> will be the same
> if we do this type of analyze from vacuumdb or manually via ANALYZE command.
>
> Attached is the query used by vacuumdb on HEAD.
>
> What do you think?

After looking at this a bit more, it occurred to me there are 2 routines in
analyze.c and extended_stats.c that are called examine_attribute(), and
their purpose is to check if an attribute should be analyzed and if so, return
a VacAttrStats.

```
/*
* examine_attribute -- pre-analysis of a single column
*
* Determine whether the column is analyzable; if so, create and initialize
* a VacAttrStats struct for it. If not, return NULL.
*/
static VacAttrStats *
examine_attribute(Node *expr)
```

```
/*
* examine_attribute -- pre-analysis of a single column
*
* Determine whether the column is analyzable; if so, create and initialize
* a VacAttrStats struct for it. If not, return NULL.
*
* If index_expr isn't NULL, then we're trying to analyze an expression index,
* and index_expr is the expression tree representing the column's data.
*/
static VacAttrStats *
examine_attribute(Relation onerel, int attnum, Node *index_expr)
```

So, we may just need to implement 2 new helpers that check if column
stats or extended
stats are missing from pg_statistic/pg_statistic_ext, and we can then
check the following

1/ if missing_stats_only flag is set to TRUE
2/ and examine_attribute returns VacAttrStats ( attribute is analyzable )
3/ and we do not have stats in pg_statististic or pg_statistic_ext
for the attribute ( the 2 routines can determine this )

The attribute should then be analyzed.

This should also work without much more effort if specific columns are
specified in the
ANALYZE command:

```
analyze table1 (col1, col2);
```

Another comment I have:

> - Whether this behavior and naming align with expectations.

I think the option name should be called MISSING_STATS_ONLY,
which is both clearer in intention and matches the option in vacuumdb.

--
Sami Imseih
Amazon Web Services (AWS)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2026-01-29 21:18:14 Re: Document NULL
Previous Message Masahiko Sawada 2026-01-29 20:47:30 Re: Refactor replication origin state reset helpers