| From: | Ildar Musin <i(dot)musin(at)postgrespro(dot)ru> | 
|---|---|
| To: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> | 
| Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: General purpose hashing func in pgbench | 
| Date: | 2018-01-25 09:42:25 | 
| Message-ID: | ae0c93b1-f8a8-a9aa-244c-0c872fa789f0@postgrespro.ru | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Hello Fabien,
On 18.01.2018 12:06, Fabien COELHO wrote:
>> My intention was to introduce seed variable which potentially could be
>> used in different contexts, not only for hash functions.
>
> Yes, good point. I'd need it for the pseudo-random permutation function.
>
>> I renamed it to 'hash_seed' for now. But what do you think about
>> naming it simply 'seed' or choosing some other general name?
>
> ISTM "seed" that is prone to being used for something else in a script.
> What about ":default_seed", which says it all?
>
Sounds good, renamed to "default_seed".
>
> Some minor suggestions:
>
> In "make_func", I'd assert that nargs is positive in the default branch.
Added assert for non-negative nargs (since there is pi() function with 
zero arguments).
>
> The default seed may be created with just one assignment instead of
> repeated |= ops. Or not:-)
>
Tried this, but after applying pgindent it turns into a mess : ) So I 
left it in the initial form.
> In evalStandardFunc, I'd suggest to avoid the ? construction and use an
> if and a direct setIntValue in both case, removing the "result"
> variable, as done in other branches (eg bitwise operators...). Maybe
> something like:
>
>   if (func == murmur2)
>     setIntValue(retval, getHashXXX(val, seed));
>   else if (...)
>     ...
>   else
>     Assert(0);
>
> so that it is structurally ready for other hash functions if needed.
>
Done. Probably if more functions are added it would make sense to change 
it to "switch".
> Documentation:
>
> In the table, use official names in the description, and add wikipedia
> links, something like:
>
> FNV hash ->
>   <ulink
> url="https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function">FNV-1a
> hash</ulink>
>
> murmur2 hash ->
>   <ulink url="https://en.wikipedia.org/wiki/MurmurHash">MurmurHash2
> hash</ulink>
>
>
> In the text:
>
> "Hash functions accepts" -> "Hash functions <literal>hash</literal>,
> <......> and <....> accept*NO S*"
>
> "... provided hash_seed value is used" => "... provided the value of
> <literal>:hash_seed</literal> is used, which is initialized randomly
> unless set by the command-line <literal>-D</literal> option."
>
> ISTM that the Automatic Variable table should be in alphabetical order.
>
Updated the documentation. Thanks!
-- 
Ildar Musin
i(dot)musin(at)postgrespro(dot)ru
| Attachment | Content-Type | Size | 
|---|---|---|
| pgbench_hash_v9.patch | text/x-patch | 12.8 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Amit Langote | 2018-01-25 09:44:48 | Re: list partition constraint shape | 
| Previous Message | Etsuro Fujita | 2018-01-25 09:30:31 | Re: non-bulk inserts and tuple routing |