Re: Division in dynahash.c due to HASH_FFACTOR

From: Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Division in dynahash.c due to HASH_FFACTOR
Date: 2020-09-08 11:17:08
Message-ID: VI1PR0701MB69602B97FE699D32ACE12597F6290@VI1PR0701MB6960.eurprd07.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Thomas, Tomas :), Alvaro, hackers,

>> > After removing HASH_FFACTOR PostgreSQL still compiles... Would
>> > removing it break some external API/extensions ?
>>
>> FWIW, HASH_FFACTOR has *never* been used in Postgres core code.
[..]
>> I think if we know that there's a 4% performance increase by removing
>> the division that comes with an ability to use a custom fillfactor that
>> nobody uses or has ever used, it makes sense to remove that ability.

Thomas wrote:
>I think Tomas is right, and the correct fix for this would be to
>compute it up front every time the hash table size changes, but since
>apparently no one ever used it, +1 for just removing it and leaving it
>hard-coded.
[..]
>For what it's worth, for dshash.c I just hard coded it to double the
>hash table size whenever the number of elements in a partition
>exceeded 75%. Popular hash tables in standard libraries seem to use
>either .75 or 1.0 as a default or only setting. There's probably room
>for discussion about numbers in those ranges, (..)

Tomas also asked the same "what if you lower the fill factor, e.g. to 0.8? Does that improve performance or not?" and got some interesting links for avoiding bias. I couldn't find any other simple way to benchmark a real impact of it other than checking DB crash-recovery (if anyone has some artificial workload generator with PinBuffer->hash_search() please let me know).

So I've been testing it using WAL crash-recovery using Thomas's approach [1]: always the same workload to reply, always on the same machine, same env: same 13GB WAL to recover, 8GB db, 24GB shared buffers on top of 128GB RAM, NVMe with fsync off to put dynahash as primary bottleneck (BufTableLookup()+smgropen()), append-only workload, gcc -O3: Results are <1st time is the WAL recovery timing, 2nd time is checkpoint before opening DB>. There were four measurements per each scenario, first one is cut off to avoid noise:

1. DEF_FFACTOR=1 idiv on
103.971, 3.719
103.937, 3.683
104.934, 3.691

2. DEF_FFACTOR=1 idiv off
100.565, 3.71
101.595, 3.807
100.794, 3.691

3. DEF_FFACTOR=1.2 idiv on // some proof that nobody uses it
1599552729.041 postmaster 94510 FATAL: failed to initialize hash table "SERIALIZABLEXID hash"

4. DEF_FFACTOR=0.8 idiv on // another proof ? The reason for below is that ffactor is long and as such cannot handle floats, if it would be float....

Program terminated with signal 8, Arithmetic exception.
#0 0x00000000009a4119 in init_htab (nelem=4, hashp=0x17cd238) at dynahash.c:677
677 nbuckets = next_pow2_int((nelem - 1) / hctl->ffactor + 1);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-292.180.amzn1.x86_64

#0 0x00000000009a4119 in init_htab (nelem=4, hashp=0x17cd238) at dynahash.c:677
#1 hash_create (tabname=tabname(at)entry=0xb77f9a "Timezones", nelem=nelem(at)entry=4, info=info(at)entry=0x7ffd3055cf30, flags=flags(at)entry=16) at dynahash.c:529
#2 0x00000000009ecddf in init_timezone_hashtable () at pgtz.c:211
#3 pg_tzset (name=name(at)entry=0xb49fd5 "GMT") at pgtz.c:248
#4 0x00000000009ecfee in pg_timezone_initialize () at pgtz.c:372

5. DEF_FFACTOR=1 idiv on // just in case reproducer to validate measurement #1
104.333, 3.804
104.313, 3.772
103.951, 3.687

6. DEF_FFACTOR=2 idiv on
105.406, 3.783
105.33, 3.721
105.403, 3.777

7. DEF_FFACTOR=4 idiv on
105.803, 3.722
105.73, 3.728
106.104, 3.756

Some notes:
- SMgrRelationHash is initialized from start at 400, while DB here is small 8GB and has only couple of tables -> no expansion needed in above test.
- local backend private overflow hash table for buffers: PrivateRefCountHash is initialized at 100 and maybe it grows during
- googling for DEF_FFACTOR I've found only 1 entry with value of <> 1 from 1993 (see this [2] , what's interesting is the same DEF_SEGSIZE value after all those years)

Alvaro wrote:
>> ... however, if we're really tense about improving some hash table's
>> performance, it might be worth trying to use simplehash.h instead of
>> dynahash.c for it.

Thomas wrote:
> Sure, but removing the dynahash IDIV also seems like a slam dunk removal of a useless expensive feature.

I agree with both, I just thought it might be interesting finding as this idiv might be (?) present in other common paths like ReadBuffer*() / PinBuffer() (some recent discussions, maybe on NUMA boxes), not just WAL recovery as it seems relatively easy to improve.

-J.

[1] - https://github.com/macdice/redo-bench
[2] - https://fuhrwerks.com/csrg/info/93c40a660b6cdf74

________________________________
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Tuesday, September 8, 2020 2:55 AM
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>; pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Division in dynahash.c due to HASH_FFACTOR

On Sat, Sep 5, 2020 at 2:34 AM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> On 2020-Sep-04, Jakub Wartak wrote:
> > After removing HASH_FFACTOR PostgreSQL still compiles... Would
> > removing it break some external API/extensions ?
>
> FWIW, HASH_FFACTOR has *never* been used in Postgres core code.
>
> https://postgr.es/m/20160418180711.55ac82c0@fujitsu already reported
> that this flag was unused a few years ago. And a search through
> codesearch.debian.net shows that no software packaged by Debian uses
> that flag either.
>
> *If* any third-party extension is using HASH_FFACTOR, it can easily be
> unbroken by removing the flag or #defining it to zero; the removal would
> only be a problem if anyone showed that there is a performance loss by
> using the default fillfactor for some dynahash table instead of their
> custom value.
>
> I think if we know that there's a 4% performance increase by removing
> the division that comes with an ability to use a custom fillfactor that
> nobody uses or has ever used, it makes sense to remove that ability.

I think Tomas is right, and the correct fix for this would be to
compute it up front every time the hash table size changes, but since
apparently no one ever used it, +1 for just removing it and leaving it
hard-coded.

For what it's worth, for dshash.c I just hard coded it to double the
hash table size whenever the number of elements in a partition
exceeded 75%. Popular hash tables in standard libraries seem to use
either .75 or 1.0 as a default or only setting. There's probably room
for discussion about numbers in those ranges, but I think the concept
of customisable load factors with a range much wider than that may be
more relevant for disk-based hash tables (like our hash indexes),
which have very different economics. I think the name HASH_FFACTOR
may be a clue that this was possibly interference from disk-based hash
tables from the same Berkeley people: rather than "load factor", the
more usual term (?) for nentries/nbuckets in memory-based hash tables
as a way to model number of expected collisions, they called it "fill
factor", which I guess is because in disk-based designs your actually
want a certain rate of collisions, because you're working not with
elements in an array and wanting to avoid collisions while not wasting
space, but rather with fixed sized buckets that you want to fill up,
but not overflow. </archeological-speculation-mode>

> ... however, if we're really tense about improving some hash table's
> performance, it might be worth trying to use simplehash.h instead of
> dynahash.c for it.

Sure, but removing the dynahash IDIV also seems like a slam dunk
removal of a useless expensive feature.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message osumi.takamichi@fujitsu.com 2020-09-08 11:36:10 RE: extension patch of CREATE OR REPLACE TRIGGER
Previous Message Pavel Stehule 2020-09-08 10:57:12 Re: INSERT ON CONFLICT and RETURNING