BUG #17787: Seriously wrong value of pg_class.reltuples for FTS GIN index after VACUUM (INDEX_CLEANUP)

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: maxim(dot)boguk(at)gmail(dot)com
Subject: BUG #17787: Seriously wrong value of pg_class.reltuples for FTS GIN index after VACUUM (INDEX_CLEANUP)
Date: 2023-02-12 09:46:19
Message-ID: 17787-b2dbe62bdfabd467@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 17787
Logged by: Maxim Boguk
Email address: maxim(dot)boguk(at)gmail(dot)com
PostgreSQL version: 14.5
Operating system: Ubuntu
Description:

Hi,

On one of production system our in-house index bloat monitoring started
behave weird.
After some research I found that the value of pg_class.reltuples for FTS GIN
index could have two different values with 2 order magnitude difference
between them depends of what had been performed last on the table
(auto)vacuum (with index cleanup) or (auto)analyze.

Now what I have on production:
analyze verbose processor_callbacks;

select count(*) from processor_callbacks;
count
----------
33439573

select reltuples from pg_class where relname='payload_index';
reltuples
---------------
3.3456344e+07

So far seems ok... now after:
VACUUM (verbose, INDEX_CLEANUP ON) "public"."processor_callbacks";
the database produce:
select reltuples from pg_class where relname='payload_index';
reltuples
---------------
3.5565809e+09

After analyze verbose processor_callbacks; value return to original
value:
analyze processor_callbacks;
select reltuples from pg_class where relname='payload_index';
reltuples
---------------
3.3622728e+07

Every time I run analyze on the table - I have 3.3e+07 reltupes for this
index,
every time I run VACUUM (INDEX_CLEANUP ON) on the table - I have 3.5e+09
reltupes for this index (e.g. issue absolutely repeatable).

The table (and index) structure:
\d+ processor_callbacks
Table
"public.processor_callbacks"
Column | Type | Collation | Nullable |
Default | Storage | Compression | Stats target |
Description
-----------------+-----------------------------+-----------+----------+----------------------------------+----------+-------------+--------------+-------------
id | integer | | not null |
generated by default as identity | plain | | |

...
payload | json | | not null |
| extended | | |
Indexes:
"processor_callbacks_pkey" PRIMARY KEY, btree (id)
"payload_index" gin (to_tsvector('english'::regconfig, payload ->>
'params'::text))
...

I think first value better represents realty, but second on could have sense
as well (depend of how to define "tupes" in GIN index I suspect),
however both of them cannot be correct in same time.

While testing on small sample of 100rows I wasn't able to reproduce that
behavior, but it easily reproducible with vacuum full:

create table test as select payload from processor_callbacks order by id
desc limit 100;
create index test_gin on test USING GIN (to_tsvector('english'::regconfig,
payload ->> 'params'::text));
analyze test;
select reltuples from pg_class where relname='test_gin';
reltuples
-----------
100
vacuum FULL test;
select reltuples from pg_class where relname='test_gin';
reltuples
-----------
15834
analyze test;
select reltuples from pg_class where relname='test_gin';
reltuples
-----------
100
And so on...

PS:
I suspect that the issue could be related to the following discussion:
https://www.postgresql.org/message-id/flat/CAD21AoAA7%2BETUJo%3Dj2L8KAdKF8Q9_5uqwNx6H8rucFm6aRZSBA%40mail.gmail.com#2f06043f394a835c993ebf23a2af1183

Kind Regards,
Maxim

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2023-02-12 10:00:01 BUG #17788: Incorrect memory access when parsing empty string as sql_standard interval
Previous Message Dmitry Dolgov 2023-02-11 19:00:37 Re: BUG #17774: Assert triggered on brin_minmax_multi.c