Re: Optional skipping of unchanged relations during ANALYZE?

From: VASUKI M <vasukianand0119(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Robert Treat <rob(at)xzilla(dot)net>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Christoph Berg <myon(at)debian(dot)org>
Subject: Re: Optional skipping of unchanged relations during ANALYZE?
Date: 2026-01-29 12:19:47
Message-ID: CAE2r8H5ZYiFxYzhWDAyHO1jxSn7oB-9hPV8VdhAkkJjCwfg0HQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi all,

I am back with a new patch.

As a follow-up to the previous discussion, I’m posting a revised v3 patch
that introduces an opt-in ANALYZE (MISSING_STATS) option.

The goal is to reduce unnecessary work when running manual ANALYZE over
many relations, while keeping the default behavior unchanged.

Overview:
ANALYZE (MISSING_STATS) analyzes only relations that currently have no
statistics entries in pg_statistic. Relations that already have statistics
are skipped.

This is conceptually similar to vacuumdb --missing-stats-only, but exposed
at the SQL ANALYZE level for interactive and scripted use.

The decision is intentionally table-level and conservative:

- If a relation has at least one pg_statistic entry, it is considered to
“have stats” and may be skipped.
- If no pg_statistic rows exist (new table, stats removed, crash reset),
the relation is analyzed.

No thresholds or modification counters are used in this option.

Behavior summary for ANALYZE(MISSING_STATS);

New empty table------------------------------->analyzed
Re-run on empty table------------------------> analyzed
Table with data but no stats-----------------> analyzed
Re-run after stats exist-----------------------> skipped
Add new column after ANALYZE----------> analyzed
Re-run after column stats exist-------------> skipped
Statistics manually deleted (pg_statistic) -> analyzed
Statistics lost after crash recovery-----------> analyzed
Regular ANALYZE -------------------------------> unchanged behavior

This ensures that ANALYZE (MISSING_STATS) converges toward a no-op on
subsequent runs, while still recovering from missing or invalid statistics.

Scope and limitations:
- Applies only to regular relations.
- Uses pg_statistic directly (not pg_stats or pg_stat views).
- Does not consider modification thresholds or autovacuum heuristics.
- Partitioned tables, inheritance, and extended statistics are not handled
yet and can be considered separately.

I would appreciate feedback on:

- Whether this behavior and naming align with expectations.
- Any edge cases I may have missed.
- Whether this is a reasonable first step before considering more advanced
options (e.g., modified-stats thresholds).

While testing i have noted this :
analyze_test=# ALTER TABLE ms1 ADD COLUMN b int;
ALTER TABLE
Time: 44.665 ms
analyze_test=# ANALYZE (MISSING_STATS);
DEBUG: ANALYZE processing relation "ms1" (OID 32791)
analyze_test=# SELECT attname
FROM pg_statistic s
JOIN pg_class c ON c.oid = s.starelid
JOIN pg_attribute a
ON a.attrelid = c.oid AND a.attnum = s.staattnum
WHERE c.relname = 'ms1'
ORDER BY attname;
attname
---------
a
b
(2 rows)

Time: 1.390 ms
analyze_test=# ANALYZE (MISSING_STATS);
DEBUG: ANALYZE (MISSING_STATS): skipping relation "ms1" (OID 32791)
analyze_test=# SELECT
a.attname,
s.stanullfrac,
s.stadistinct
FROM pg_statistic s
JOIN pg_class c ON c.oid = s.starelid
JOIN pg_attribute a
ON a.attrelid = c.oid AND a.attnum = s.staattnum
WHERE c.relname = 'ms1';
attname | stanullfrac | stadistinct
---------+-------------+-------------
a | 0 | -1
b | 1 | 0
(2 rows)

Time: 0.733 ms

Note:
[1] The empty table[if a table has 0 rows] --> no pg_statistic rows at all
[2] If a table has >= 1 row then postgres creates pg_statistic rows for
every user column,even if the
- Column is entirely null
- The column was added later
- The column has never had a non-null values

Thanks for your time and review, I will post the next patch for
modified_stats shortly.

*Vasuki MC-DAC,Chennai*

Attachment Content-Type Size
v3-0001-ANALYZE-add-MISSING_STATS-option.patch text/x-patch 8.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Xuneng Zhou 2026-01-29 12:22:17 Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery
Previous Message Nikolay Shaplov 2026-01-29 11:58:52 Re: Custom oauth validator options