| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | "Joel Jacobson" <joel(at)compiler(dot)org> |
| Cc: | "Tender Wang" <tndrwang(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq |
| Date: | 2026-03-03 15:31:06 |
| Message-ID: | 1010506.1772551866@sss.pgh.pa.us |
| Lists: | pgsql-hackers |
"Joel Jacobson" <joel(at)compiler(dot)org> writes:
> On Sun, Mar 1, 2026, at 22:12, Tom Lane wrote:
>> Aside: you could argue that failing to consider stanullfrac is wrong,
>> and maybe it is. But the more I looked at this code the more
>> convinced I got that it was only partially accounting for nulls
>> anyway. That seems like perhaps something to look into later.
> How about adjusting estfract for the null fraction before clamping?
This reminds me of the unfinished business at [1]. We really ought
to make it true that nulls never get into the hash table before
we assume that's so in costing. One thing I thought was being
overlooked is the possibility of lots of nulls bloating whichever
hash bucket they get put in; but if they aren't put into a bucket,
then it's not wrong to ignore them here.
(Strictly speaking, that's still not so with non-strict hash operators,
but those are so rare that I don't mind not accounting for them.)
regards, tom lane
[1] https://www.postgresql.org/message-id/flat/3061845.1746486714@sss.pgh.pa.us
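
To make the bucket-bloat concern concrete, here is a minimal sketch in C. It is not PostgreSQL source: the function name `fullest_bucket_rows` and all of its parameters are hypothetical, and it assumes that every NULL key hashes to the same bucket and that no other values collide with the most common non-null key.

```c
#include <assert.h>

/*
 * Hypothetical illustration, not PostgreSQL code.
 * Estimate how many inner rows land in the fullest hash bucket.
 *
 * ntuples        - total number of inner rows
 * nullfrac       - fraction of inner rows whose hash key is NULL
 * mcv_freq       - fraction of inner rows holding the most common non-null key
 * nulls_in_table - nonzero if NULL-keyed rows are inserted into the hash table
 */
double
fullest_bucket_rows(double ntuples, double nullfrac, double mcv_freq,
                    int nulls_in_table)
{
    double mcv_rows;
    double null_rows;

    assert(ntuples >= 0.0);
    assert(nullfrac >= 0.0 && mcv_freq >= 0.0);
    assert(nullfrac + mcv_freq <= 1.0);

    mcv_rows = ntuples * mcv_freq;
    null_rows = ntuples * nullfrac;

    /* Assumption: all NULL keys hash alike, so they share one bucket. */
    if (nulls_in_table && null_rows > mcv_rows)
        return null_rows;

    /* With nulls filtered out, only the most common value matters. */
    return mcv_rows;
}
```

With one million inner rows, a null fraction of 0.5, and a most-common-value frequency of 0.01, the sketch gives about 500,000 rows in the fullest bucket when nulls are stored and about 10,000 when they are kept out. The second case is the one in which ignoring nulls in the estimate is harmless.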