Re: Use simplehash.h instead of dynahash in SMgr

From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Use simplehash.h instead of dynahash in SMgr
Date: 2021-04-26 19:44:13
Message-ID: d051e4b8f125f837ffff1d798e6268d7@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andres Freund писал 2021-04-26 21:46:
> Hi,
>
> On 2021-04-25 01:27:24 +0300, Yura Sokolov wrote:
>> It is quite interesting result. Simplehash being open-addressing with
>> linear probing is friendly for cpu cache. I'd recommend to define
>> SH_FILLFACTOR with value lower than default (0.9). I believe 0.75 is
>> suitable most for such kind of hash table.
>
> It's not a "plain" linear probing hash table (although it is on the
> lookup
> side). During insertions collisions are reordered so that the average
> distance
> from the "optimal" position is ~even ("robin hood hashing"). That
> allows a
> higher load factor than a plain linear probed hash table (for which
> IIRC
> there's data to show 0.75 to be a good default load factor).

Even for Robin Hood hashing 0.9 fill factor is too high. It leads to too
much movements on insertion/deletion and longer average collision chain.

Note that Robin Hood doesn't optimize average case. Indeed, usually
Robin Hood
has worse (longer) average collision chain than simple linear probing.
Robin Hood hashing optimizes worst case, ie it guarantees there is no
unnecessary
long collision chain.

See discussion on Rust hash table fill factor when it were Robin Hood:
https://github.com/rust-lang/rust/issues/38003

>
> There of course may still be a benefit in lowering the load factor, but
> I'd
> not start there.
>
> David's test aren't really suited to benchmarking the load factor, but
> to me
> the stats he showed didn't highlight a need to lower the load factor.
> Lowering
> the fill factor does influence the cache hit ratio...
>
> Greetings,
>
> Andres Freund

regards,
Yura.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2021-04-26 19:44:46 Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY
Previous Message Andrew Dunstan 2021-04-26 19:31:09 multi-version capable PostgresNode.pm