Re: BUG #17205: pg_class.reltuples vakue wrong for brin indexes after vacuum (by many orders of magnitude)

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: maxim(dot)boguk(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17205: pg_class.reltuples vakue wrong for brin indexes after vacuum (by many orders of magnitude)
Date: 2021-10-01 06:09:19
Message-ID: YVamD+7JWOOiCO39@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, Sep 30, 2021 at 07:19:53AM +0000, PG Bug reporting form wrote:
> pg_class.reltuples wrong by many order of magnitude after (auto)vacuum table
> for brin indexes.
> Analyze table set corrent value back, and following vacuum set it wrong
> again (so correctness of value in production db depend on exact sequence of
> autovacuu/autoanalyze runs).
>
> mboguk=# vacuum t;
> VACUUM
> mboguk=# select reltuples from pg_class where relname='t_brin';
> reltuples
> -----------
> 345
> (1 row)

This information comes from the index cleanup done within vacuums,
gets updated during vac_update_relstats() and computed by each index
AM with their optional vacuum cleanup callback. For brin, that means
calling brinvacuumcleanup where the summarization of the page ranges
counts one tuple for each range tracked (see *numSummarized and
*numExisting pointing at the same area to make sure that no tuples are
missed), so this result of 345 is I guess correct.

> mboguk=# analyze t;
> ANALYZE
> mboguk=# select reltuples from pg_class where relname='t_brin';
> reltuples
> --------------
> 9.999977e+06
> (1 row)

This one is updated again with a vac_update_relstats(), but the number
of index tuples is derived from the number of the parent's relation
with the number updated based on a fractal number (here that would be
1.0 as the index is not partial). The thing is that we call
index_vacuum_cleanup() for analyze runs that are not part of VACUUM,
meaning that we do have the stats at hand, but just don't save them
back into each the pg_class entry of the impacted index.

It seems to me that we would finish with incorrect stats if we keep
calling blindly vac_update_relstats() all the time for an analyze
path. Even out-of-core AMs would not have their stats saved even if
they pushed some numbers into their IndexBulkDeleteResult(), numbers
that differ a bit with GIN actually, as it is the only in-core index
AM that does some work during an analyze-only operation.

Thoughts?
--
Michael

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2021-10-01 07:58:41 Re: pg_upgrade test for binary compatibility of core data types
Previous Message David G. Johnston 2021-10-01 05:16:07 Re: BUG #17208: Download fails when running 'yum install postgresql96'