| From: | Andres Freund <andres(at)anarazel(dot)de> |
|---|---|
| To: | Ants Aasma <ants(dot)aasma(at)cybertec(dot)at> |
| Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Hash aggregate collisions cause excessive spilling |
| Date: | 2026-02-19 19:28:58 |
| Message-ID: | nozry5gmmyddh3qq3o6x7ofqkbcrzfijd5s266polwaa7gddhz@kybgvdlkglzl |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
On 2026-02-19 14:01:15 -0500, Andres Freund wrote:
> The hash functions of some types are ... peculiar (partially due to
> cross-type hashjoin support).
In case you want a reference for said peculiarity, look at this:
=# SELECT lower, upper, lower | upper as V, hashint8(lower | upper) FROM (SELECT (i) as lower, ((i)::int8 << 32) as upper FROM generate_series(0, 5) g(i));
┌───────┬─────────────┬─────────────┬────────────┐
│ lower │ upper │ v │ hashint8 │
├───────┼─────────────┼─────────────┼────────────┤
│ 0 │ 0 │ 0 │ -272711505 │
│ 1 │ 4294967296 │ 4294967297 │ -272711505 │
│ 2 │ 8589934592 │ 8589934594 │ -272711505 │
│ 3 │ 12884901888 │ 12884901891 │ -272711505 │
│ 4 │ 17179869184 │ 17179869188 │ -272711505 │
│ 5 │ 21474836480 │ 21474836485 │ -272711505 │
└───────┴─────────────┴─────────────┴────────────┘
(6 rows)
Which obviously could have some ... fun consequences.
Whoever chose to implement the cross type compatibility by
a) using int4 as the hash "domain", instead of always hashing 8 bytes
b) implementing int8 by doubling up the high 32bits into the lower 32 bits,
instead of hashint4 iff the value value fits into 32bits and the full 8
bytes otherwise
did us really no service.
Greetings,
Andres Freund
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Robert Haas | 2026-02-19 19:29:52 | Re: Consider low startup cost in add_partial_path |
| Previous Message | Tom Lane | 2026-02-19 19:19:33 | Re: assume availability of "inline" keyword |