Re: Combining hash values

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Combining hash values
Date: 2016-08-01 22:19:22
Message-ID: CA+Tgmobpm8SxuB2Y4G672jx+xwZZmVmvZUPUKZ_3RYW1=5=KAQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Aug 1, 2016 at 11:27 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> writes:
>> On that subject, while looking at hashfunc.c, I spotted that
>> hashint8() has a very obvious deficiency, which causes disastrous
>> performance with certain inputs:
>
> Well, if you're trying to squeeze 64 bits into a 32-bit result, there
> are always going to be collisions somewhere.
>
>> I'd suggest using hash_uint32() for values that fit in a 32-bit
>> integer and hash_any() otherwise.
>
> Perhaps, but this'd break existing hash indexes. That might not be
> a fatal objection, but if we're going to put users through that
> I'd like to think a little bigger in terms of the benefits we get.
> I've thought for some time that we needed to move to 64-bit hash function
> results, because the size of problem that's reasonable to use a hash join
> or hash aggregation for keeps increasing. Maybe we should do that and fix
> hashint8 as a side effect.

Well, considering that Amit is working on making hash indexes
WAL-logged in v10[1], this seems like an awfully good time to get any
breakage we want to do out of the way.
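
For readers following along, here is a rough sketch of the kind of change being discussed. It assumes the deficiency in question is the step that folds the high 32 bits of an int8 into the low 32 bits with XOR before hashing; the quoted text does not spell that out. The names fold_xor(), mix32(), mix64to32() and hash_int8_sketch() are hypothetical stand-ins, not PostgreSQL functions: mix32() is the MurmurHash3 32-bit finalizer standing in for hash_uint32(), and mix64to32() stands in for hash_any().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fold step: XOR the high half (complemented for negative
 * values) into the low half, so a small int8 folds to its int4 value. */
static uint32_t
fold_xor(int64_t val)
{
    uint32_t lohalf = (uint32_t) val;
    uint32_t hihalf = (uint32_t) ((uint64_t) val >> 32);

    lohalf ^= (val >= 0) ? hihalf : ~hihalf;
    return lohalf;
}

/* Stand-in for hash_uint32(): the MurmurHash3 32-bit finalizer. */
static uint32_t
mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}

/* Stand-in for hash_any() on 8 bytes: mix the high half first, then
 * combine it with the low half, instead of XORing the raw halves. */
static uint32_t
mix64to32(uint64_t v)
{
    uint32_t lo = (uint32_t) v;
    uint32_t hi = (uint32_t) (v >> 32);

    return mix32(lo ^ mix32(hi + 0x9e3779b9U));
}

/* Sketch of the suggestion: hash values that fit in 32 bits exactly as
 * an int4 would be hashed, and hash all 64 bits otherwise. */
static uint32_t
hash_int8_sketch(int64_t val)
{
    if (val >= INT32_MIN && val <= INT32_MAX)
        return mix32((uint32_t) val);
    return mix64to32((uint64_t) val);
}

/* Every non-negative value whose two halves are equal folds to zero,
 * so all such keys would land in the same hash bucket. */
static void
check_fold_collisions(void)
{
    assert(fold_xor(INT64_C(0x0000000100000001)) == 0);
    assert(fold_xor(INT64_C(0x0000000200000002)) == 0);
    assert(fold_xor(INT64_C(0x7FFFFFFF7FFFFFFF)) == 0);
}
```

The sketch keeps small int8 values hashing the same as the corresponding int4 values, since the 32-bit branch hashes the same word an int4 would. It does change the result for every value outside the 32-bit range, which is the index-breaking part raised above.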

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] https://www.postgresql.org/message-id/CAA4eK1LfzcZYxLoXS874Ad0+S-ZM60U9bwcyiUZx9mHZ-KCWhw@mail.gmail.com
