| From: | VASUKI M <vasukianand0119(at)gmail(dot)com> |
|---|---|
| To: | Sami Imseih <samimseih(at)gmail(dot)com> |
| Cc: | Robert Treat <rob(at)xzilla(dot)net>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Christoph Berg <myon(at)debian(dot)org> |
| Subject: | Re: Optional skipping of unchanged relations during ANALYZE? |
| Date: | 2026-01-29 12:19:47 |
| Message-ID: | CAE2r8H5ZYiFxYzhWDAyHO1jxSn7oB-9hPV8VdhAkkJjCwfg0HQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi all,
I am back with a new patch.
As a follow-up to the previous discussion, I’m posting a revised v3 patch
that introduces an opt-in ANALYZE (MISSING_STATS) option.
The goal is to reduce unnecessary work when running manual ANALYZE over
many relations, while keeping the default behavior unchanged.
Overview:
ANALYZE (MISSING_STATS) analyzes only relations that currently have no
statistics entries in pg_statistic. Relations that already have statistics
are skipped.
This is conceptually similar to vacuumdb --missing-stats-only, but exposed
at the SQL ANALYZE level for interactive and scripted use.
The decision is intentionally table-level and conservative:
- If a relation has at least one pg_statistic entry, it is considered to
“have stats” and may be skipped.
- If no pg_statistic rows exist (new table, stats removed, crash reset),
the relation is analyzed.
No thresholds or modification counters are used in this option.
Behavior summary for ANALYZE(MISSING_STATS);
New empty table------------------------------->analyzed
Re-run on empty table------------------------> analyzed
Table with data but no stats-----------------> analyzed
Re-run after stats exist-----------------------> skipped
Add new column after ANALYZE----------> analyzed
Re-run after column stats exist-------------> skipped
Statistics manually deleted (pg_statistic) -> analyzed
Statistics lost after crash recovery-----------> analyzed
Regular ANALYZE -------------------------------> unchanged behavior
This ensures that ANALYZE (MISSING_STATS) converges toward a no-op on
subsequent runs, while still recovering from missing or invalid statistics.
Scope and limitations:
- Applies only to regular relations.
- Uses pg_statistic directly (not pg_stats or pg_stat views).
- Does not consider modification thresholds or autovacuum heuristics.
- Partitioned tables, inheritance, and extended statistics are not handled
yet and can be considered separately.
I would appreciate feedback on:
- Whether this behavior and naming align with expectations.
- Any edge cases I may have missed.
- Whether this is a reasonable first step before considering more advanced
options (e.g., modified-stats thresholds).
While testing i have noted this :
analyze_test=# ALTER TABLE ms1 ADD COLUMN b int;
ALTER TABLE
Time: 44.665 ms
analyze_test=# ANALYZE (MISSING_STATS);
DEBUG: ANALYZE processing relation "ms1" (OID 32791)
analyze_test=# SELECT attname
FROM pg_statistic s
JOIN pg_class c ON c.oid = s.starelid
JOIN pg_attribute a
ON a.attrelid = c.oid AND a.attnum = s.staattnum
WHERE c.relname = 'ms1'
ORDER BY attname;
attname
---------
a
b
(2 rows)
Time: 1.390 ms
analyze_test=# ANALYZE (MISSING_STATS);
DEBUG: ANALYZE (MISSING_STATS): skipping relation "ms1" (OID 32791)
analyze_test=# SELECT
a.attname,
s.stanullfrac,
s.stadistinct
FROM pg_statistic s
JOIN pg_class c ON c.oid = s.starelid
JOIN pg_attribute a
ON a.attrelid = c.oid AND a.attnum = s.staattnum
WHERE c.relname = 'ms1';
attname | stanullfrac | stadistinct
---------+-------------+-------------
a | 0 | -1
b | 1 | 0
(2 rows)
Time: 0.733 ms
Note:
[1] The empty table[if a table has 0 rows] --> no pg_statistic rows at all
[2] If a table has >= 1 row then postgres creates pg_statistic rows for
every user column,even if the
- Column is entirely null
- The column was added later
- The column has never had a non-null values
Thanks for your time and review, I will post the next patch for
modified_stats shortly.
*Vasuki MC-DAC,Chennai*
| Attachment | Content-Type | Size |
|---|---|---|
| v3-0001-ANALYZE-add-MISSING_STATS-option.patch | text/x-patch | 8.4 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Xuneng Zhou | 2026-01-29 12:22:17 | Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery |
| Previous Message | Nikolay Shaplov | 2026-01-29 11:58:52 | Re: Custom oauth validator options |