Re: Should HashSetOp go away

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Should HashSetOp go away
Date: 2025-11-02 16:26:06
Message-ID: 426377.1762100766@sss.pgh.pa.us
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> I had a look at the v2 patches. Mostly quibbles, but #4 seems like an oversight.

Thanks for reviewing!

> 1) For the following:
> + tuples_space = nentries * (MAXALIGN(SizeofMinimalTupleHeader) +
> + MAXALIGN(tupleWidth) +
> + MAXALIGN(additionalsize));
> If I'm not mistaken, technically that should be
> MAXALIGN(SizeofMinimalTupleHeader + tupleWidth) +
> MAXALIGN(additionalsize),

No, I think it's correct as written: the data payload of a tuple
must always start on a MAXALIGN boundary. As you say, it doesn't
matter as long as SizeofMinimalTupleHeader is 16, but I think this
way is formally correct. (It would matter more if we were trying
to account for tuples' null bitmaps ...)

> 2) Would it be better to reference the function name
> "buildSubPlanHash" instead of "above" in:
> + * Adjust the rowcount estimate in the same way as above, except that we

Done.

> 3) Quite a collection of naming styles here.

Yeah :-( ... we work in an old and none-too-consistent code base.
Do you have any specific suggestions about which of these functions
might fit its surroundings better with a different caps style?

> 4) I think this is missing a "/ SH_FILLFACTOR"
> + /* should be safe to convert to uint64 */
> + size = (uint64) nentries;
> i.e do what SH_CREATE does.

Oh! I think I got confused because some of that logic is in
SH_CREATE and some in SH_COMPUTE_SIZE :-(. Good catch.

> 5) Is it switching "Max(nbuckets, 1);" to "nbuckets" in
> hash_choose_num_buckets(). Looks like BuildTupleHashTable() will do
> that part for us.

Yeah, we could do that. I was trying to not touch more of nodeAgg
than I had to, but it seems sensible to not duplicate something
that BuildTupleHashTable will do.

v3 attached.

regards, tom lane

Attachment Content-Type Size
v3-0001-Improve-planner-s-estimates-of-tuple-hash-table-s.patch text/x-diff 18.5 KB
v3-0002-Change-long-numGroups-fields-to-be-Cardinality-i..patch text/x-diff 19.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-11-02 16:47:45 Re: C nitpick about pgwin32_dispatch_queued_signals()
Previous Message Bryan Green 2025-11-02 16:22:16 Re: Avoid overflow (src/backend/utils/adt/formatting.c)