Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq

From: Tender Wang <tndrwang(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Joel Jacobson <joel(at)compiler(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq
Date: 2026-03-01 03:40:11
Message-ID: CAHewXNnYQSCRQ9PaQyViBEB6UKC08nqCzE6YjNcZxuvbThRBgg@mail.gmail.com
Lists: pgsql-hackers

Hi all,
In my previous email I wrote:

> Yeah, in my last email, I said I tried this way. But I worried that
> rel->tuples may be zero for an empty relation.

On reflection, that is safe here, because an empty relation has no rows in
pg_statistic, so we never enter the
if (HeapTupleIsValid(vardata.statsTuple)) branch.
Sorry for the noise.

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote on Sun, Mar 1, 2026 at 08:08:

> Hmm ... doesn't this contradict your argument that avgfreq and
> mcv_freq need to be calculated on the same basis? Admittedly
> that was just a heuristic, but I'm not seeing why it's wrong.
>

Agreed.

> > The reason for this is that estfract is calculated as:
> > estfract = 1.0 / ndistinct;
> > where ndistinct has been adjusted to account for restriction clauses.
> > Therefore, we must also use the adjusted avgfreq when adjusting
> > estfract here:
>
> It feels like that might end up double-counting the effects of
> the restriction clauses.
>
> Anyway, we all seem to agree that s/rel->rows/rel->tuples/ is the
> correct fix for a newly-introduced bug. I'm inclined to proceed
> by committing that fix (along with any regression test fallout)
> and then investigating the avgfreq change as an independent matter.

+1
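
For anyone following along, here is a minimal standalone sketch of the two
avgfreq choices discussed above. It is not the selfuncs.c code: the
BucketInputs struct, the estimate_fraction() name and the
use_adjusted_avgfreq flag are made up for illustration, the nbuckets cap is
left out, and the clamping is simplified. It only shows that estfract comes
out differently depending on whether avgfreq is computed on the raw or the
restriction-adjusted ndistinct, while mcv_freq stays on the raw basis.

```c
#include <assert.h>

/* Hypothetical, simplified inputs; not PostgreSQL's real structs. */
typedef struct
{
    double tuples;       /* rows in the raw relation (think rel->tuples) */
    double rows;         /* rows left after restriction clauses (think rel->rows) */
    double ndistinct;    /* distinct values in the raw relation */
    double stanullfrac;  /* fraction of NULLs */
    double mcv_freq;     /* frequency of the most common value, raw basis */
} BucketInputs;

/*
 * Model of the skew adjustment. use_adjusted_avgfreq selects whether
 * avgfreq is computed from the raw or the restriction-adjusted ndistinct.
 */
static double
estimate_fraction(const BucketInputs *in, int use_adjusted_avgfreq)
{
    double adj_ndistinct = in->ndistinct;
    double avgfreq;
    double estfract;

    assert(in->ndistinct > 0.0);

    /* Scale ndistinct by the fraction of rows that survive restrictions. */
    if (in->tuples > 0.0)
        adj_ndistinct *= in->rows / in->tuples;
    if (adj_ndistinct < 1.0)
        adj_ndistinct = 1.0;    /* simplified clamp */

    /* Initial bucket fraction uses the adjusted ndistinct. */
    estfract = 1.0 / adj_ndistinct;

    if (use_adjusted_avgfreq)
        avgfreq = (1.0 - in->stanullfrac) / adj_ndistinct;
    else
        avgfreq = (1.0 - in->stanullfrac) / in->ndistinct;

    /* Scale up for skew when the MCV is more common than average. */
    if (avgfreq > 0.0 && in->mcv_freq > avgfreq)
        estfract *= in->mcv_freq / avgfreq;

    if (estfract > 1.0)
        estfract = 1.0;
    return estfract;
}
```

With tuples = 1000, rows = 100, ndistinct = 100, stanullfrac = 0 and
mcv_freq = 0.05, this model gives about 0.5 with the raw avgfreq and about
0.1 with the adjusted one, so the choice of basis matters. Which result is
the better estimate is the open question to investigate separately.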

--
Thanks,
Tender Wang
