Re: Change GUC hashtable to use simplehash?

From: John Naylor <johncnaylorls(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gurjeet Singh <gurjeet(at)singh(dot)im>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Change GUC hashtable to use simplehash?
Date: 2024-02-06 07:59:52
Message-ID: CANWCAZbKvnbKwiMEEruknzrXvG+H-y49ej30E9VZAdpgHWWRyg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
>
> It occurred to me that it's strange to have two places that length can
> be passed. That was a side effect of the original, which used length
> to both know how many bytes to read, and to modify the internal seed.
> With the incremental API, it doesn't make sense to pass the length (or
> a dummy macro) up front -- with a compile-time fixed length, it can't
> possibly break a tie, so it's just noise.

This was a wart, so pushed removing initial length from the incremental API.

On Mon, Jan 22, 2024 at 11:16 AM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>
> On Mon, 2024-01-22 at 09:03 +0700, John Naylor wrote:
> > v15-0004 is a stab at that. As an idea, it also renames zero_bytes_le
> > to zero_byte_low to reflect the effect better. There might be some
> > other comment edits needed to explain usage, so I plan to hold on to
> > this for later. Let me know what you think.
>
> 0004 looks good to me. No urgency so feel free to hold it until a
> convenient time.

Thanks for looking, I pushed this along with an expanded explanation of usage.

> 0002 and 0003 use fasthash for dynahash and GUC hash, respectively.
> These cannot use the existing cstring hashing directly because of
> truncation and case-folding, respectively. (Some simplehash uses can,
> but that can come later)

I've re-attached these as well as a cleaned-up version of the tail
optimization. For the CF entry, the GUC hash function in this form
might only be necessary if we went ahead with simple hash. We don't
yet have a new benchmark to show if that's still worthwhile after
867dd2dc87 improved the one upthread.

For dynahash, one tricky part seems to be the comment about the
default and when it was an assertion error. I've tried to reword this,
but maybe needs work. When that's in shape, I'll incorporate removing
other strlen calls.

Attachment Content-Type Size
v18-0001-Use-fasthash-for-dynahash-s-default-string-hash.patch text/x-patch 4.2 KB
v18-0002-Use-fasthash-for-guc_name_hash.patch text/x-patch 2.0 KB
v18-0003-Speed-up-tail-processing-when-hashing-aligned-C-.patch text/x-patch 3.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2024-02-06 08:19:56 Re: 2024-02-08 release announcement draft
Previous Message Kyotaro Horiguchi 2024-02-06 07:58:16 Re: InstallXLogFileSegment() vs concurrent WAL flush