| From: | "Joel Jacobson" <joel(at)compiler(dot)org> |
|---|---|
| To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | "Tender Wang" <tndrwang(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq |
| Date: | 2026-03-05 06:17:29 |
| Message-ID: | 58a749a2-71a0-4d48-b70b-c9a6e3e54ae6@app.fastmail.com |
| Lists: | pgsql-hackers |
On Wed, Mar 4, 2026, at 21:50, Tom Lane wrote:
> "Joel Jacobson" <joel(at)compiler(dot)org> writes:
>> On Tue, Mar 3, 2026, at 16:31, Tom Lane wrote:
>>> This reminds me of the unfinished business at [1]. We really ought
>>> to make it true that nulls never get into the hash table before
>>> we assume that's so in costing.
>
>> Hmm, OK, so there are cases when we don't discard NULLs when we should
>> be able to? I was reading these lines in nodeHash.c and thought we would
>> always be discarding them when possible:
>
>>     if (!isnull)
>>     {
>>         ...
>>     }
>>     else if (node->keep_null_tuples)
>>     {
>>         /* null join key, but we must save tuple to be emitted later */
>>         ...
>>     }
>>     /* else we can discard the tuple immediately */
>
> I'm confused ... that keep_null_tuples bit appears nowhere in HEAD,
> but it does appear in the patch at [1].

Oh, sorry, I was looking at nodeHash.c with [1] applied.
I recalled seeing some `if (!isnull)` code in HEAD; it must have been this:

    if (!isnull)
        ExecParallelHashTableInsert(hashtable, slot, hashvalue);

> Anyway, the short answer is that we discard NULLs if possible, but
> it's not possible when doing an outer join that requires returning
> null-extended rows from the hashed side.
Thanks for explaining.
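
To check my understanding, here is a minimal standalone sketch of that rule. It is not code from nodeHash.c: the `Sketch*` enums and both functions are hypothetical names made up for illustration, and the join kinds are deliberately simplified to four cases. It assumes the hashed side is the one that gets null-extended in a right or full join, and that a null join key can never match.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified join kinds; NOT PostgreSQL's JoinType enum. */
typedef enum
{
    SKETCH_JOIN_INNER,      /* no unmatched rows are emitted */
    SKETCH_JOIN_LEFT,       /* unmatched rows of the probe side are emitted */
    SKETCH_JOIN_RIGHT,      /* unmatched rows of the hashed side are emitted */
    SKETCH_JOIN_FULL        /* unmatched rows of both sides are emitted */
} SketchJoinKind;

/* What to do with one tuple read from the hashed side. */
typedef enum
{
    SKETCH_INSERT,          /* non-null key: goes into the hash table */
    SKETCH_KEEP_FOR_FILL,   /* null key, but must be emitted null-extended */
    SKETCH_DISCARD          /* null key and nobody will ever need the tuple */
} SketchAction;

/* Does this join have to return null-extended rows from the hashed side? */
static bool
sketch_fills_hashed_side(SketchJoinKind kind)
{
    assert(kind <= SKETCH_JOIN_FULL);
    return kind == SKETCH_JOIN_RIGHT || kind == SKETCH_JOIN_FULL;
}

/*
 * A null join key cannot match any probe tuple, so the tuple is only
 * worth keeping when the join must emit unmatched hashed-side rows.
 */
static SketchAction
sketch_classify_hashed_tuple(SketchJoinKind kind, bool key_is_null)
{
    if (!key_is_null)
        return SKETCH_INSERT;
    if (sketch_fills_hashed_side(kind))
        return SKETCH_KEEP_FOR_FILL;
    return SKETCH_DISCARD;
}
```

So under these assumptions, a null-keyed tuple is dropped for the inner and left cases and kept for the right and full cases, which is the situation where the costing cannot assume the nulls are gone.
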
> I've now pushed the patch we were discussing before, and all that's
> left to worry about (AFAIK) in estimate_hash_bucket_stats is its
> handling of null join keys.
Nice!
> I'd prefer to get the other patch
> in before worrying more about that.
Makes sense.
>
> regards, tom lane
>
> [1]
> https://www.postgresql.org/message-id/flat/3061845.1746486714%40sss.pgh.pa.us
/Joel