| From: | Ants Aasma <ants(dot)aasma(at)cybertec(dot)at> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Hash aggregate collisions cause excessive spilling |
| Date: | 2026-02-19 18:24:17 |
| Message-ID: | CANwKhkOZuf8ychD3b=8-+-hxyu+OjOCU56AesbHiDqpRmARc0w@mail.gmail.com |
| Lists: | pgsql-hackers |
On Thu, 19 Feb 2026 at 20:07, Ants Aasma <ants(dot)aasma(at)cybertec(dot)at> wrote:
> I was thinking more along the lines of hashing together the pointer
> value and worker number. But something more deterministic would indeed
> be better. How about this?
>
> --- a/src/backend/executor/execGrouping.c
> +++ b/src/backend/executor/execGrouping.c
> @@ -201,3 +201,3 @@ BuildTupleHashTable(PlanState *parent,
> MemoryContext oldcontext;
> - uint32 hash_iv = 0;
> + uint32 hash_iv = parent->plan->plan_node_id;
I can confirm that this fixes the issue. A standalone reproducer is here:

create table data as select random(1,1000000) from generate_series(1,10000000);
vacuum analyze data;
set enable_gathermerge = off;
explain analyze select distinct random from data;
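
To illustrate why a per-node initial value helps, here is a rough sketch. It is not the PostgreSQL code: mix32, seeded_hash and same_bucket_count are hypothetical names, and the murmur3-style finalizer only stands in for whatever final mixing step the real hash uses. The assumption is that two hash tables seeded with different initial values (for example, different plan_node_id values, as in the diff above) put the same key into unrelated buckets, while identical initial values put every key into the same bucket in both tables.

```c
#include <stdint.h>

/* Murmur3-style 32-bit finalizer; a stand-in for the final mixing step. */
static uint32_t
mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}

/*
 * Hypothetical single-column hash: start from the initial value (iv),
 * rotate it left by one bit, fold in the column's hash, then mix.
 */
static uint32_t
seeded_hash(uint32_t iv, uint32_t key_hash)
{
    uint32_t h = iv;

    h = (h << 1) | (h >> 31);   /* rotate left by 1 */
    h ^= key_hash;
    return mix32(h);
}

/*
 * Count how many keys in [0, nkeys) land in the same bucket under two
 * different initial values. nbuckets must be a power of two.
 */
static uint32_t
same_bucket_count(uint32_t iv_a, uint32_t iv_b,
                  uint32_t nkeys, uint32_t nbuckets)
{
    uint32_t mask = nbuckets - 1;
    uint32_t same = 0;

    for (uint32_t k = 0; k < nkeys; k++)
    {
        if ((seeded_hash(iv_a, k) & mask) == (seeded_hash(iv_b, k) & mask))
            same++;
    }
    return same;
}
```

With equal initial values, same_bucket_count(0, 0, 1000, 1024) counts all 1000 keys, since both tables compute the identical hash. With different initial values only a small fraction of keys should coincide by chance, so the output order of one table no longer lines up with the bucket order of the next.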
Regards,
Ants Aasma