|From:||Ildar Musin <i(dot)musin(at)postgrespro(dot)ru>|
|To:||Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>|
|Subject:||Re: General purpose hashing func in pgbench|
On 20/12/2017 10:36, Fabien COELHO wrote:
> As there may be several hash functions included in the long run. I'd
> suggest that the hash function should be named more precisely, eg
Done. Added "hash_murmur2" too, see below.
> The image looks like the distribution is more regularly scattered than
> actually randomized... Maybe this is because the first highest 256
> values are really scattered by the process multiply/modulo process. Or
> maybe this is an optical effect?
After your comment I searched the internet for comparisons of different
hashing algorithms with respect to randomness and found an interesting
post on StackExchange. According to the author's research, the murmur2
algorithm has the best randomness rate (among those he tested). So I
implemented it (using the original code by Austin Appleby as a reference)
and conducted a few experiments. The results are in the attachment. Indeed,
compared to murmur2, the FNV distribution seems pretty regular.
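
For illustration only, here is a minimal sketch of the two hash families
compared above. It is not the code from the attached patch. The function
names are hypothetical, the FNV-1a variant is an assumption (the text above
only says "FNV"), and the murmur2 part assumes the 64-bit MurmurHash64A
mixing steps specialized to a single 8-byte input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, 64-bit: XOR each byte into the state, then multiply by the FNV prime. */
uint64_t fnv1a_64(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t h = UINT64_C(0xcbf29ce484222325);   /* FNV offset basis */

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= UINT64_C(0x100000001b3);            /* FNV prime */
    }
    return h;
}

/*
 * Hash the eight bytes of an integer, least significant byte first.
 * The byte order is an arbitrary choice made here for portability.
 */
uint64_t fnv1a_64_int(uint64_t v)
{
    unsigned char b[8];

    for (int i = 0; i < 8; i++)
        b[i] = (unsigned char) (v >> (8 * i));
    return fnv1a_64(b, sizeof b);
}

/* MurmurHash2-style 64-bit mixing for exactly one 8-byte block. */
uint64_t murmur2_64_int(uint64_t k, uint64_t seed)
{
    const uint64_t m = UINT64_C(0xc6a4a7935bd1e995);
    const int r = 47;
    uint64_t h = seed ^ (8 * m);                 /* 8 = input length in bytes */

    /* Mix the block. */
    k *= m;
    k ^= k >> r;
    k *= m;

    h ^= k;
    h *= m;

    /* Final avalanche. */
    h ^= h >> r;
    h *= m;
    h ^= h >> r;
    return h;
}
```

FNV only does one XOR and one multiply per byte, whereas the murmur2 steps
add shift-XOR rounds that fold high bits back into low bits. That may be why
sequential inputs look more regular under FNV, though this is a guess and not
something the experiments above establish.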
> ISTM that there are undesired utf8 chars in a comment. Should be kept
Oops, I copy-pasted the algorithm name from Wikipedia and didn't notice
that it contained some fancy Unicode hyphens.
> I would put the actual hash computation in a separate function rather
> than inlined in the evaluator.
> Add the submission to the next CF?
I think it is not commitfest ready yet -- I need to add some
documentation and tests first.
Russian Postgres Company