Re: Hash support for arrays

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: marcin mank <marcin(dot)mank(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hash support for arrays
Date: 2010-11-02 18:07:04
Message-ID: AANLkTiniSLu0HE4dQJk3VApcytWBe+XLnJUPshWz-QH4@mail.gmail.com
Lists: pgsql-hackers

On Sat, Oct 30, 2010 at 10:01 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> marcin mank <marcin(dot)mank(at)gmail(dot)com> writes:
>> On Sat, Oct 30, 2010 at 6:21 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> 3. To hash, apply the element type's hash function to each array
>>> element.  Combine these values by rotating the accumulator left
>>> one bit at each step and xor'ing in the next element's hash value.
>>>
>>> Thoughts?  In particular, is anyone aware of a better method
>>> for combining the element hash values?
>
>> This would make the hash the same for arrays with elements 32 apart swapped.
>
> Well, there are *always* going to be cases where you get the same hash
> value for two different inputs; it's unavoidable given that you have to
> combine N 32-bit hash values into only one 32-bit output.

Sure. The goal is to make those hard to predict, though. I think
"multiply by 31 and add the next value" is a fairly standard way of
getting that behavior. It mixes up the bits a lot more than just
left-shifting by a variable offset.
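
As a rough illustration of the two combining rules discussed above (a
sketch only, not code from any PostgreSQL patch), the Python below
implements "rotate the accumulator left one bit and xor" and "multiply
by 31 and add" over 32-bit values. The function names, the stand-in
element hashes, and the choice of positions 3 and 35 are all
assumptions made for the example. It shows the collision marcin
describes: swapping two elements 32 positions apart leaves the
rotate-and-xor result unchanged, because both elements end up rotated
by the same amount modulo 32.

```python
MASK = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def rotl1(x):
    """Rotate a 32-bit value left by one bit."""
    return ((x << 1) | (x >> 31)) & MASK


def combine_rotate_xor(hashes):
    """Rotate the accumulator left one bit, then xor in the next hash."""
    acc = 0
    for h in hashes:
        acc = rotl1(acc) ^ (h & MASK)
    return acc


def combine_mul31(hashes):
    """Multiply the accumulator by 31, then add the next hash (mod 2**32)."""
    acc = 0
    for h in hashes:
        acc = (acc * 31 + (h & MASK)) & MASK
    return acc


def swap(seq, i, j):
    """Return a copy of seq with the elements at i and j exchanged."""
    out = list(seq)
    out[i], out[j] = out[j], out[i]
    return out


# Hypothetical per-element hash values, standing in for the output of
# the element type's hash function.
element_hashes = [(i * 2654435761) & MASK for i in range(1, 41)]

# Exchange two elements that sit exactly 32 positions apart.
swapped = swap(element_hashes, 3, 35)

# Rotate-and-xor cannot tell the two arrays apart.
print(combine_rotate_xor(element_hashes) == combine_rotate_xor(swapped))  # True

# Multiply-by-31 does distinguish them for these particular inputs.
print(combine_mul31(element_hashes) != combine_mul31(swapped))  # True
```

Neither rule avoids collisions in general, since many 32-bit inputs
are still folded into one 32-bit output. The sketch only shows that
the rotate-and-xor collisions follow a simple positional pattern.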

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
