Re: Change GUC hashtable to use simplehash?

From: John Naylor <johncnaylorls(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Ants Aasma <ants(dot)aasma(at)cybertec(dot)at>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gurjeet Singh <gurjeet(at)singh(dot)im>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Change GUC hashtable to use simplehash?
Date: 2024-04-07 01:40:15
Message-ID: CANWCAZYLvWOm-1XLT8wGoXGgNBEV_NPMWJY4gM6cXa7LHVDjtw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 30, 2024 at 4:13 AM Ants Aasma <ants(dot)aasma(at)cybertec(dot)at> wrote:
>
> But given that we know the data length and we have it in a register
> already, it's easy enough to just mask out data past the end with a
> shift. See patch 1. Performance benefit is about 1.5x Measured on a
> small test harness that just hashes and finalizes an array of strings,
> with a data dependency between consecutive hashes (next address
> depends on the previous hash output).

I pushed this with a couple cosmetic adjustments, after fixing the
endianness issue. I'm not sure why valgrind is fine with this way, and
the other ways I tried forming the (little-endian) mask raised errors.
In addition to "zero_byte_low | (zero_byte_low - 1)", I tried
"~zero_byte_low & (zero_byte_low - 1)" and "zero_byte_low ^
(zero_byte_low - 1)" to no avail.

On Thu, Mar 28, 2024 at 12:37 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> 0001 looks good to me, thank you.
>
> > v21-0003 adds a new file hashfn_unstable.c for convenience functions
> > and converts all the duplicate frontend uses of hash_string_pointer.
>
> Why not make hash_string() inline, too? I'm fine with it either way,
> I'm just curious why you went to the trouble to create a new .c file so
> it didn't have to be inlined.

Thanks for looking! I pushed these, with hash_string() inlined.

I've attached (not reindented for clarity) an update of something
mentioned a few times already -- removing strlen calls for dynahash
and dshash string keys. I'm not quite sure how the comments should be
updated about string_hash being deprecated to call directly. This
patch goes further and semi-deprecates calling it at all, so these
comments seem a bit awkward now.

Attachment Content-Type Size
v21-0001-Use-fasthash-for-string-keys-in-dynahash-and-dsh.patch text/x-patch 6.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2024-04-07 02:08:40 Re: [PoC] Improve dead tuple storage for lazy vacuum
Previous Message Andres Freund 2024-04-07 01:39:17 Re: Flushing large data immediately in pqcomm