Re: Hash support for arrays

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: marcin mank <marcin(dot)mank(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hash support for arrays
Date: 2010-11-02 20:42:19
Message-ID: 12251.1288730539@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Nov 2, 2010 at 11:21 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Really? I think "I don't understand when this fails" isn't obviously
>> better than being able to predict when it fails ...

> Isn't that the whole point of hash functions? Collisions are
> inevitable, but you want them to be unpredictable. If you want a hash
> function with predictable collision behavior, just has the first
> element. It'll be highly predictable AND wicked fast.

That seems like a rather poor straw man, since it suffers from exactly
the defect I'm complaining about, namely failing to consider all the
input values equally.

>> I'd be happier about this approach if there were some actual theory
>> behind it ... maybe there's some analysis out there, but the one link
>> that was given was just about entirely unconvincing.

> I think it's from Knuth, though unfortunately I don't have a copy to
> check. There are probably better algorithms out there, but this one's
> pretty simple.

I don't see anything in Knuth suggesting a multiplier of 31. His
recommendation for a multiplier, if you're going to use multiplicative
hashing, is wordsize/phi (phi being the golden ratio) ... and he also
wants you to keep the high order not the low order bits of the product.

However, this is largely beside the point, because that theory, as well
as the Java code you're arguing from, has to do with the initial hashing
of a raw sequence of input items. Not with combining some existing hash
values. The rotate-and-xor method I suggested for that is borrowed
exactly from section 6.4 of Knuth (page 512, in the first edition of
volume 3).

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kenneth Marshall 2010-11-02 20:46:18 Re: Hash support for arrays
Previous Message Kevin Grittner 2010-11-02 20:25:16 Re: Hash support for arrays