Re: General purpose hashing func in pgbench

From: Ildar Musin <i(dot)musin(at)postgrespro(dot)ru>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: General purpose hashing func in pgbench
Date: 2017-12-25 13:24:38
Message-ID: 38f1503e-7b63-4612-24a9-041af25bb022@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Fabien,

24/12/2017 11:12, Fabien COELHO пишет:
>
> Yep. The ugliness is significantly linked to the choice of name. With
> MM2_MUL and MM2_ROT ISTM that it is more readable:
>
>>     k *= MM2_MUL;
>>     k ^= k >> MM2_ROT;
>>     k *= MM2_MUL;
>>     result ^= k;
>>     result *= MM2_MUL;
Ok, will do.
>
>> [...] So I'd better leave it the way it is. Actually I was thinking
>> to do the same to fnv1a too : )
>
> I think that the implementation style should be homogeneous, so I'd
> suggest at least to stick to one style.
>
> I noticed from the source of all human knowledege (aka Wikipedia:-)
> that there seems to be a murmur3 successor. Have you considered it?
> One good reason to skip it would be that the implementation is long
> and complex. I'm not sure about a 8-byte input simplified version.
Murmur2 naturally supports 8-byte data. Murmur3 has 32- and 128-bit
versions. So to handle int64 I could
1) split input value into two halfs and combine somehow the results of
32 bit version or
2) use 128-bit version and discard higher bytes.

Btw, postgres core already has a 32bit murmur3 implementation, but it
only uses the finalization part of algorithm (see murmurhash32). As my
colleague Yura Sokolov told me in a recent conversation it alone
provides pretty good randomization. I haven't tried it yet though.

>
> Just a question: Have you looked at SipHash24?
>
>     https://en.wikipedia.org/wiki/SipHash
>
> The interesting point is that it can use a key and seems somehow
> cryptographically secure, for a similar cost. However the how to
> decide for/control the key is unclear.
>
Not yet. As I can understand from the wiki its main feature is to
prevent attacks with crafted input data. How can it be useful in
benchmarking? Unless it shows superior performance and randomization.

--
Ildar Musin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2017-12-25 14:12:19 Re: General purpose hashing func in pgbench
Previous Message Amit Langote 2017-12-25 09:49:05 Re: Unique indexes & constraints on partitioned tables