Re: Hash aggregate collisions cause excessive spilling

From: Andres Freund <andres(at)anarazel(dot)de>
To: Ants Aasma <ants(dot)aasma(at)cybertec(dot)at>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hash aggregate collisions cause excessive spilling
Date: 2026-02-19 19:28:58
Message-ID: nozry5gmmyddh3qq3o6x7ofqkbcrzfijd5s266polwaa7gddhz@kybgvdlkglzl
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2026-02-19 14:01:15 -0500, Andres Freund wrote:
> The hash functions of some types are ... peculiar (partially due to
> cross-type hashjoin support).

In case you want a reference for said peculiarity, look at this:

=# SELECT lower, upper, lower | upper as V, hashint8(lower | upper) FROM (SELECT (i) as lower, ((i)::int8 << 32) as upper FROM generate_series(0, 5) g(i));
┌───────┬─────────────┬─────────────┬────────────┐
│ lower │ upper │ v │ hashint8 │
├───────┼─────────────┼─────────────┼────────────┤
│ 0 │ 0 │ 0 │ -272711505 │
│ 1 │ 4294967296 │ 4294967297 │ -272711505 │
│ 2 │ 8589934592 │ 8589934594 │ -272711505 │
│ 3 │ 12884901888 │ 12884901891 │ -272711505 │
│ 4 │ 17179869184 │ 17179869188 │ -272711505 │
│ 5 │ 21474836480 │ 21474836485 │ -272711505 │
└───────┴─────────────┴─────────────┴────────────┘
(6 rows)

Which obviously could have some ... fun consequences.

Whoever chose to implement the cross type compatibility by
a) using int4 as the hash "domain", instead of always hashing 8 bytes
b) implementing int8 by doubling up the high 32bits into the lower 32 bits,
instead of hashint4 iff the value value fits into 32bits and the full 8
bytes otherwise

did us really no service.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2026-02-19 19:29:52 Re: Consider low startup cost in add_partial_path
Previous Message Tom Lane 2026-02-19 19:19:33 Re: assume availability of "inline" keyword