Re: General purpose hashing func in pgbench

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Ildar Musin <i(dot)musin(at)postgrespro(dot)ru>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: General purpose hashing func in pgbench
Date: 2018-03-06 12:53:49
Message-ID: 1ad42902-ef1f-a715-7010-1288fb8aae89@sigaev.ru
Lists: pgsql-hackers

>> Patch applies, compiles, pgbench & global "make check" ok, doc built ok.

Agreed.

If I understand the upthread discussion correctly, the implementation of the
Murmur hash algorithm is based on Austin Appleby's work:
https://github.com/aappleby/smhasher/blob/master/src/MurmurHash2.cpp

If so, I have one remark and two objections:

1) It seems like a good idea to credit Austin Appleby in the comments.

2) The reference implementation says explicitly (link above):
// 2. It will not produce the same results on little-endian and big-endian
// machines.

I don't think that is a good thing for testing and benchmarking, for several
reasons: on machines of different endianness it could produce different data
sets, different selects, and a different distribution.
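
To illustrate, here is a minimal sketch (not taken from the patch or from the
reference implementation) of one common way to make such a read independent of
byte order: assemble each 32-bit word from individual bytes in a fixed order
instead of casting the pointer. The helper name load_le32 is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: build a 32-bit value from
 * four bytes, always treating p[0] as the least significant byte.
 * Because it only uses byte accesses and shifts, the result is the same
 * on little-endian and big-endian machines.
 */
static uint32_t
load_le32(const unsigned char *p)
{
    return (uint32_t) p[0] |
           ((uint32_t) p[1] << 8) |
           ((uint32_t) p[2] << 16) |
           ((uint32_t) p[3] << 24);
}
```

A hash that consumes its input only through a helper like this would see the
same word values on every platform, at the cost of a few extra shifts.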

3) Again, from the comments of the reference implementation:
// Note - This code makes a few assumptions about how your machine behaves -
// 1. We can read a 4-byte value from any address without crashing

That is not true for all supported platforms. Any box with strict alignment
requirements will get a SIGBUS here.
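
As a minimal sketch of the usual workaround (again, not code from the patch):
rather than dereferencing a possibly misaligned uint32_t pointer, copy the
bytes into a local variable with memcpy. The helper name load_u32_unaligned is
hypothetical.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: read 4 bytes from an
 * arbitrary address without assuming any alignment. memcpy performs
 * byte-wise access semantics, so this is safe on strict-alignment
 * platforms. The value is still in native byte order, so this addresses
 * the alignment problem only, not the endianness one.
 */
static uint32_t
load_u32_unaligned(const void *p)
{
    uint32_t    v;

    memcpy(&v, p, sizeof(v));
    return v;
}
```

Compilers typically turn such a fixed-size memcpy into a single load on
platforms where unaligned access is allowed, so the portable form need not be
slower.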

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
