Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq

From: Tender Wang <tndrwang(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Joel Jacobson <joel(at)compiler(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq
Date: 2026-03-01 04:14:40
Message-ID: CAHewXNn2Z-k_+af=E85gGYfsQNSuxph6bWq--rUZWFNxHKSb6w@mail.gmail.com

On Sun, Mar 1, 2026 at 11:53, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Tender Wang <tndrwang(at)gmail(dot)com> writes:
> > In my previous email, I worried rel->tuples may be zero for an empty relation.
> > But here it's safe, because an empty relation has no tuples in pg_statistic.
>
> Not sure about that --- it seems possible that after a mass delete,
> VACUUM could update pg_class.reltuples to zero without touching
> pg_statistic.

Yeah, possibly.

> And I also don't remember whether the planner clamps
> rel->tuples to be at least 1.

As far as I know, the planner only clamps rel->rows to be at least 1;
it does not clamp rel->tuples.

> But it doesn't matter. If rel->tuples
> is zero, the if-test will prevent us from dividing by zero, and then
> we'll leave *mcv_freq as zero meaning "unknown", which seems fine.
> It's the same thing that would have happened before bd3e3e9e5.

In my first email, I only replaced rel->rows in this line:
*mcv_freq = 1.0 / vardata.rel->rows;
I forgot to also replace rel->rows in the if-test, which is why I had
that concern.
My mistake.
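
To spell out the point, here is a minimal sketch of the guard and the
division using the same field. It is not the actual
estimate_hash_bucket_stats code: SketchRel and fallback_mcv_freq are
hypothetical names made up for illustration, and the exact form of the
real if-test is an assumption.

```c
/* Hypothetical stand-in for the two RelOptInfo fields discussed here. */
typedef struct SketchRel
{
    double rows;    /* estimated result rows; the planner clamps this to >= 1 */
    double tuples;  /* taken from pg_class.reltuples; may be zero */
} SketchRel;

/*
 * Assumed shape of the fallback: return 1/tuples when tuples is positive,
 * otherwise 0.0, which the caller treats as "unknown".
 */
static double
fallback_mcv_freq(const SketchRel *rel)
{
    double mcv_freq = 0.0;      /* zero means "unknown" */

    /* Test the same field we divide by, so a zero can never reach the division. */
    if (rel->tuples > 0.0)
        mcv_freq = 1.0 / rel->tuples;

    return mcv_freq;
}
```

With tuples = 0 the function returns 0.0 ("unknown") even though rows
has been clamped to 1. If the if-test still checked rows while the
division used tuples, that same input would reach the division with a
zero denominator, which was my concern.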

--
Thanks,
Tender Wang
