Re: Change GUC hashtable to use simplehash?

From: John Naylor <johncnaylorls(at)gmail(dot)com>
To: Ants Aasma <ants(dot)aasma(at)cybertec(dot)at>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gurjeet Singh <gurjeet(at)singh(dot)im>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Change GUC hashtable to use simplehash?
Date: 2024-03-05 10:30:16
Message-ID: CANWCAZbg_XeSeY0a_PqWmWqeRATvzTzUNYRLeT+bzs+YQdC92g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 30, 2024 at 5:04 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
>
> On Tue, Jan 30, 2024 at 4:13 AM Ants Aasma <ants(dot)aasma(at)cybertec(dot)at> wrote:
> > But given that we know the data length and we have it in a register
> > already, it's easy enough to just mask out data past the end with a
> > shift. See patch 1. Performance benefit is about 1.5x Measured on a
> > small test harness that just hashes and finalizes an array of strings,
> > with a data dependency between consecutive hashes (next address
> > depends on the previous hash output).
>
> Interesting work! I've taken this idea and (I'm guessing, haven't
> tested) improved it by re-using an intermediate step for the
> conditional, simplifying the creation of the mask, and moving the
> bitscan out of the longest dependency chain.

This needed a rebase, and is now 0001. I plan to push this soon.

I also went and looked at the simplehash instances and found a few
that would be easy to switch over. Rather than try to figure out which
could benefit from shaving cycles, I changed all the string hashes,
and one more, in 0002 so they can act as examples.

0003 uses fasthash for resowner, as suggested by Heikki upthread. Now
murmur64 has no callers, but it (or similar *) could be used in
pg_dump/common.c for hashing CatalogId (8 bytes).

Commit 42a1de3013 added a new use for string_hash, but I can't tell
from a quick glance whether it uses the truncation, so I'm going to
take a closer look before re-attaching the proposed dynahash change
again.

* some examples here:
https://www.boost.org/doc/libs/1_84_0/boost/container_hash/detail/hash_mix.hpp

Attachment Content-Type Size
v19-0001-Speed-up-tail-processing-when-hashing-aligned-C-.patch text/x-patch 3.1 KB
v19-0003-Use-fasthash-for-hash_resource_elem.patch text/x-patch 1.5 KB
v19-0002-Convert-simplehash-hash-functions-on-strings-to-.patch text/x-patch 4.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2024-03-05 10:50:36 Re: Adding deprecation notices to pgcrypto documentation
Previous Message Bertrand Drouvot 2024-03-05 10:17:03 Re: Fix race condition in InvalidatePossiblyObsoleteSlot()