Re: Hash support for arrays

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hash support for arrays
Date: 2010-10-31 02:28:01
Message-ID: 10122.1288492081@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> I suppose you could use crc where our interface does allow incremental use.

Hm, that's a thought. I'm not sure though whether CRC really has the
behavior you want from a hash function, in particular that all the bits
are independent (a/k/a taking N low-order bits is a reasonable way to
cut the hash down to a suitable bucket index). Anybody know?

> I'm not really familiar enough with the math to know whether CRC is
> any better than your XOR mixing. I suspect they're pretty similar. I
> suppose if you have arrays of various lengths containing repetitions
> of a single value than the non-CRC would produce a hash of 0 for all
> arrays with a length that's a multiple of 32. I'm not sure what CRC
> would do.

Some quick experimenting suggests that you get -1 from an array of 32 of
the same value, then 0 from 64 of the same, etc. This doesn't bother me
particularly though. There are always going to be some situations that
produce degenerate hash values.

> Oh, one question that occurred to me you didn't mention including the
> bounds of the array or the null bitmap in the hash function. I suppose
> it's correct with or without them (or in the case of the null bitmap
> another way to put it is the choice of the hash value to represent
> null is arbitrary).

Yeah. I have it coded to treat a null element as if it had a hash value
of zero. I don't see a great need to consider the array bounds per se.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-10-31 02:47:08 Maximum function call nesting depth for regression tests
Previous Message Greg Stark 2010-10-31 02:12:24 Re: Hash support for arrays