From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Combining hash values |
Date: | 2016-07-31 16:34:12 |
Message-ID: | 21850.1469982852@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> writes:
> 2. I suspect that this algorithm for combining hashes is weak, and
> could amplify weaknesses in the hash functions feeding it.
Very possibly, but ...
> Compare
> Boost's hash_combine, which does some more work before XORing:
> seed ^= hash_value(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
I can't help being reminded of Knuth's story about he tried to invent
the world's best random number generator, and was disappointed when
it almost immediately converged to a short repeating sequence. If
there's any actual theoretical basis to the above, I'd be interested
to see it. But as-is, the use of addition rather than XOR looks fishy,
and the way the old seed is shifted around looks more likely to result
in cancellation than anything else.
> That constant approximates the golden ratio (as a fraction of the 32
> bit hash space), and it also appears in our hash_any and hash_uint32
> functions.
I think it's just somebody's idea of a magic random number. Your link
> [3] http://cstheory.stackexchange.com/questions/9639/how-did-knuth-derive-a
provides some reasons to think it might be a good choice for a very
specific application, but this is not that application --- in particular,
it doesn't involve multiplying by that constant.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Shay Rojansky | 2016-07-31 16:38:18 | Re: Slowness of extended protocol |
Previous Message | Tom Lane | 2016-07-31 16:18:50 | Re: Hash indexes and effective_cache_size in CREATE INDEX documentation |