Re: General purpose hashing func in pgbench

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Ildar Musin <i(dot)musin(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: General purpose hashing func in pgbench
Date: 2017-12-25 16:17:33
Message-ID: alpine.DEB.2.20.1712251702170.22976@lancre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hello,

>> However, the key can be used if controlled so that different values do
>> not have the same randomization in different part of the script, so as
>> to avoid using the same patterns for different things if not desirable.
>
> Original murmur algorithm accepts seed as a parameter, which can be used
> for this purpose. I used value itself as a seed in the patch, but I can
> turn it into a function argument.

Hmmm. Possibly. However it needs a default, something like

x = hash(v[, key])

with key some pseudo-random value?

>> For the pgbench pseudo-random permutation I'm looking at, the key can
>> be specified to control whether the same pattern or not should be used.
>>
> Just curious, which algorithm are you intended to choose?

I have found nothing convincing, so I'm inventing.

Most "permutation" functions are really cryptographic cyphers which are
quite expensive, and require powers of two, which is not what is needed.
ISTM that there are some constructs to deal with arbitrary sizes based on
cryptographic functions, but that would make it too expensive for the
purpose.

Currently I use the key as a seed for a cheap a prng, two rounds of
overlapping (to deal with non power of two) xor (from the prng output) and
prime-multiply-modulo to scatter the value. See attached work in progress.

If you have a pointer for a good and cheap generic (any size) keyed
permutation, feel free to share it.

--
Fabien.

Attachment Content-Type Size
test_hash_2.c text/x-csrc 6.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2017-12-25 17:40:56 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Previous Message Petr Jelinek 2017-12-25 16:10:22 Re: [HACKERS] Replication status in logical replication