Re: General purpose hashing func in pgbench

From: Ildar Musin <i(dot)musin(at)postgrespro(dot)ru>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: General purpose hashing func in pgbench
Date: 2018-01-25 09:42:25
Message-ID: ae0c93b1-f8a8-a9aa-244c-0c872fa789f0@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Fabien,

On 18.01.2018 12:06, Fabien COELHO wrote:

>> My intention was to introduce seed variable which potentially could be
>> used in different contexts, not only for hash functions.
>
> Yes, good point. I'd need it for the pseudo-random permutation function.
>
>> I renamed it to 'hash_seed' for now. But what do you think about
>> naming it simply 'seed' or choosing some other general name?
>
> ISTM "seed" that is prone to being used for something else in a script.
> What about ":default_seed", which says it all?
>

Sounds good, renamed to "default_seed".

>
> Some minor suggestions:
>
> In "make_func", I'd assert that nargs is positive in the default branch.

Added assert for non-negative nargs (since there is pi() function with
zero arguments).

>
> The default seed may be created with just one assignment instead of
> repeated |= ops. Or not:-)
>

Tried this, but after applying pgindent it turns into a mess : ) So I
left it in the initial form.

> In evalStandardFunc, I'd suggest to avoid the ? construction and use an
> if and a direct setIntValue in both case, removing the "result"
> variable, as done in other branches (eg bitwise operators...). Maybe
> something like:
>
> if (func == murmur2)
> setIntValue(retval, getHashXXX(val, seed));
> else if (...)
> ...
> else
> Assert(0);
>
> so that it is structurally ready for other hash functions if needed.
>

Done. Probably if more functions are added it would make sense to change
it to "switch".

> Documentation:
>
> In the table, use official names in the description, and add wikipedia
> links, something like:
>
> FNV hash ->
> <ulink
> url="https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function">FNV-1a
> hash</ulink>
>
> murmur2 hash ->
> <ulink url="https://en.wikipedia.org/wiki/MurmurHash">MurmurHash2
> hash</ulink>
>
>
> In the text:
>
> "Hash functions accepts" -> "Hash functions <literal>hash</literal>,
> <......> and <....> accept*NO S*"
>
> "... provided hash_seed value is used" => "... provided the value of
> <literal>:hash_seed</literal> is used, which is initialized randomly
> unless set by the command-line <literal>-D</literal> option."
>
> ISTM that the Automatic Variable table should be in alphabetical order.
>

Updated the documentation. Thanks!

--
Ildar Musin
i(dot)musin(at)postgrespro(dot)ru

Attachment Content-Type Size
pgbench_hash_v9.patch text/x-patch 12.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2018-01-25 09:44:48 Re: list partition constraint shape
Previous Message Etsuro Fujita 2018-01-25 09:30:31 Re: non-bulk inserts and tuple routing