| From: | Andres Freund <andres(at)anarazel(dot)de> |
|---|---|
| To: | Ants Aasma <ants(dot)aasma(at)cybertec(dot)at> |
| Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Hash aggregate collisions cause excessive spilling |
| Date: | 2026-02-19 17:30:07 |
| Message-ID: | vx4azu62rgrnkt4oauviepbydxj5q7wbtzycwmqnmby2sfpvwc@xfvp3pcjnv2w |
| Lists: | pgsql-hackers |
Hi,
On 2026-02-19 19:06:04 +0200, Ants Aasma wrote:
> >
> > /*
> > * If parallelism is in use, even if the leader backend is performing the
> > * scan itself, we don't want to create the hashtable exactly the same way
> > * in all workers. As hashtables are iterated over in keyspace-order,
> > * doing so in all processes in the same way is likely to lead to
> > * "unbalanced" hashtables when the table size initially is
> > * underestimated.
> > */
> > if (use_variable_hash_iv)
> >     hash_iv = murmurhash32(ParallelWorkerNumber);
> >
> >
> > I don't remember enough of how the parallel aggregate stuff works. Perhaps the
> > issue is that the leader is also building a hashtable and it's being inserted
> > into the post-gather hashtable, using the same IV?
> >
> > In which case parallel_leader_participation=off should make a difference.
>
> After turning leader participation off the problem no longer
> reproduced even after 10 iterations, turning it back on it reproduced
> on the 4th iteration. Is there any reason why the hash table couldn't
> have an unconditional iv that includes the plan node?
You mean just use the numerical value of the pointer? I think that'd be pretty
likely to be the same between parallel workers. And I think it's not great for
benchmarking / debugging if every run ends up with a different IV.

But we certainly should do something about the IV for the leader in these
cases.
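
For illustration, here is a minimal standalone sketch of the IV derivation from the quoted snippet. It is not the actual PostgreSQL code: `derive_hash_iv()` is a hypothetical helper, the default IV of 0 for the non-variable case is an assumption, and `murmurhash32()` is reproduced here as the 32-bit murmur3 finalizer, which I believe matches the one in `common/hashfn.h`. It also assumes the leader sees `ParallelWorkerNumber == -1`.

```c
#include <stdint.h>

/*
 * 32-bit murmur3 finalizer. Assumed to match PostgreSQL's murmurhash32();
 * reproduced here only so the sketch is self-contained.
 */
static inline uint32_t
murmurhash32(uint32_t data)
{
    uint32_t h = data;

    h ^= h >> 16;
    h *= 0x85ebca6b;
    h ^= h >> 13;
    h *= 0xc2b2ae35;
    h ^= h >> 16;
    return h;
}

/*
 * Hypothetical helper mirroring the quoted snippet: a hashtable built
 * without use_variable_hash_iv keeps the default IV (assumed to be 0 here),
 * otherwise the IV is derived from the worker number.
 */
static uint32_t
derive_hash_iv(int use_variable_hash_iv, int worker_number)
{
    uint32_t hash_iv = 0;

    if (use_variable_hash_iv)
        hash_iv = murmurhash32((uint32_t) worker_number);
    return hash_iv;
}
```

Because the finalizer is a bijection that maps 0 to 0, distinct worker numbers always get distinct IVs in this sketch, but worker number 0 ends up with the same IV as a hashtable built with the fixed IV. Whether that, or the leader's IV, is what matters for the reported spilling is not something this sketch can show.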
Greetings,
Andres Freund