| From: | Sami Imseih <samimseih(at)gmail(dot)com> |
|---|---|
| To: | VASUKI M <vasukianand0119(at)gmail(dot)com> |
| Cc: | Robert Treat <rob(at)xzilla(dot)net>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Christoph Berg <myon(at)debian(dot)org> |
| Subject: | Re: Optional skipping of unchanged relations during ANALYZE? |
| Date: | 2026-01-29 21:14:18 |
| Message-ID: | CAA5RZ0uBeeM1psLPpWsw3rxRVpewKnMfs+jz+XAsrck7UW6YyQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Sorry for the quick follow-up.
> > This is conceptually similar to vacuumdb --missing-stats-only, but exposed
> > at the SQL ANALYZE level for interactive and scripted use.
>
> What is being proposed in v3 is not similar in behavior.
>
> I am wondering if we should take the current SQL used by vacuumdb to
> find missing stats and perform direct syscache lookups in C?
>
> The benefit there is we can also change vacuumdb to use the ANALYZE
> (MISSING_STATS) option
> directly rather than deriving the SQL, and the MISSING_STATS behavior
> will be the same
> if we do this type of analyze from vacuumdb or manually via ANALYZE command.
>
> Attached is the query used by vacuumdb on HEAD.
>
> What do you think?
After looking at this a bit more, it occurred to me there are 2 routines in
analyze.c and extended_stats.c that are called examine_attribute(), and
their purpose is to check if an attribute should be analyzed and if so, return
a VacAttrStats.
```
/*
* examine_attribute -- pre-analysis of a single column
*
* Determine whether the column is analyzable; if so, create and initialize
* a VacAttrStats struct for it. If not, return NULL.
*/
static VacAttrStats *
examine_attribute(Node *expr)
```
```
/*
* examine_attribute -- pre-analysis of a single column
*
* Determine whether the column is analyzable; if so, create and initialize
* a VacAttrStats struct for it. If not, return NULL.
*
* If index_expr isn't NULL, then we're trying to analyze an expression index,
* and index_expr is the expression tree representing the column's data.
*/
static VacAttrStats *
examine_attribute(Relation onerel, int attnum, Node *index_expr)
```
So, we may just need to implement 2 new helpers that check if column
stats or extended
stats are missing from pg_statistic/pg_statistic_ext, and we can then
check the following
1/ if missing_stats_only flag is set to TRUE
2/ and examine_attribute returns VacAttrStats ( attribute is analyzable )
3/ and we do not have stats in pg_statististic or pg_statistic_ext
for the attribute ( the 2 routines can determine this )
The attribute should then be analyzed.
This should also work without much more effort if specific columns are
specified in the
ANALYZE command:
```
analyze table1 (col1, col2);
```
Another comment I have:
> - Whether this behavior and naming align with expectations.
I think the option name should be called MISSING_STATS_ONLY,
which is both clearer in intention and matches the option in vacuumdb.
--
Sami Imseih
Amazon Web Services (AWS)
| From | Date | Subject | |
|---|---|---|---|
| Next Message | David G. Johnston | 2026-01-29 21:18:14 | Re: Document NULL |
| Previous Message | Masahiko Sawada | 2026-01-29 20:47:30 | Re: Refactor replication origin state reset helpers |