| From: | Ants Aasma <ants(dot)aasma(at)cybertec(dot)at> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Hash aggregate collisions cause excessive spilling |
| Date: | 2026-02-28 15:21:28 |
| Message-ID: | CANwKhkOMbFeFwmzB0zoFDXKYMV-xM8===0LD3c4DHJ1wht3=6w@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Thu, 19 Feb 2026 at 21:01, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2026-02-19 20:24:17 +0200, Ants Aasma wrote:
> > On Thu, 19 Feb 2026 at 20:07, Ants Aasma <ants(dot)aasma(at)cybertec(dot)at> wrote:
> > > I was thinking more along the lines of hashing together the pointer
> > > value and worker number. But something more deterministic would indeed
> > > be better. How about this?
> > >
> > > --- a/src/backend/executor/execGrouping.c
> > > +++ b/src/backend/executor/execGrouping.c
> > > @@ -201,3 +201,3 @@ BuildTupleHashTable(PlanState *parent,
> > > MemoryContext oldcontext;
> > > - uint32 hash_iv = 0;
> > > + uint32 hash_iv = parent->plan->plan_node_id;
> >
> > I can confirm that this fixes the issue. A standalone reproducer is here:
>
> I think this needs should use something that smears the bits from the plan_id
> more widely. The hash functions of some types are ... peculiar (partially due
> to cross-type hashjoin support).
>
> I'd also combine it with use_variable_hash_iv, rather than just have
> use_variable_hash_iv overwrite it.
This problem should only affect cases where multiple hash tables worth
of tuples are trying to fit into one. I think right now that limits it
to hash aggregates with parallel query. ExecBuildHash32FromAttrs
comment says that we should be using non-zero init_value only when
needed for a marginal performance gain. So I limited the plan node id
inclusion to use_variable_hash_iv, but made it unconditional for
aggregates.
I used hash_combine to smear together the bits of the two. I think
that should be good enough. Alternatively a xor of murmurhash32 of the
two should be better.
There is also a call to ExecBuildHash32FromAttrs from ExecInitSubPlan
that specifies a hash_iv for a hashed sub plan execution. This must
match the value BuildTupleHashTable decides on. Worth adding a
comment?
Regards,
Ants Aasma
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Decorrelate-nested-hash-tables.patch | text/x-patch | 2.5 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andrey Borodin | 2026-02-28 15:26:05 | Re: [WiP] B-tree page merge during vacuum to reduce index bloat |
| Previous Message | Mihail Nikalayeu | 2026-02-28 15:16:00 | Re: Adding REPACK [concurrently] |