Re: Use compiler intrinsics for bit ops in hash

From: Jesse Zhang <sbjesse(at)gmail(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Use compiler intrinsics for bit ops in hash
Date: 2020-03-02 20:45:21
Message-ID: CAGf+fX4xSXJXNc2s4GfyeHH1Kxm5OuZ6DsXbTL2Knsg09rGkoA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi David,

On Wed, Feb 26, 2020 at 9:56 PM David Fetter <david(at)fetter(dot)org> wrote:
>
> On Wed, Feb 26, 2020 at 09:12:24AM +0100, David Fetter wrote:
> > On Fri, Jan 31, 2020 at 04:59:18PM +0100, David Fetter wrote:
> > > On Wed, Jan 15, 2020 at 03:45:12PM -0800, Jesse Zhang wrote:
> > > > On Tue, Jan 14, 2020 at 2:09 PM David Fetter <david(at)fetter(dot)org> wrote:
> > > > > > The changes in hash AM and SIMPLEHASH do look like a net positive
> > > > > > improvement. My biggest cringe might be in pg_bitutils:
> > > > > >
> > > > > > 1. Is ceil_log2_64 dead code?
> > > > >
> > > > > Let's call it nascent code. I suspect there are places it could go, if
> > > > > I look for them. Also, it seemed silly to have one without the other.
> > > > >
> > > >
> > > > While not absolutely required, I'd like us to find at least one
> > > > place and start using it. (Clang also nags at me when we have
> > > > unused functions).
> > >
> > > Done in the expanded patches attached.
I see that you've found use of it in dynahash, thanks!

The math in the new (from v4 to v6) patch is wrong: it yields
ceil_log2(1) = 1 or next_power_of_2(1) = 2. I can see that you lifted
the restriction of "num greater than one" for ceil_log2() in this patch
set, but it's now _more_ problematic to base those functions on
pg_leftmost_one_pos().

I'm not comfortable with your changes to pg_leftmost_one_pos() to remove
the restriction on word being non-zero. Specifically
pg_leftmost_one_pos() is made to return 0 on 0 input. While none of its
current callers (in HEAD) is harmed, this introduces muddy semantics:

1. pg_leftmost_one_pos is semantically undefined on 0 input: scanning
for a set bit in a zero word won't find it anywhere.

2. we can _try_ generalizing it to accommodate ceil_log2 by
extrapolating based on the invariant that BSR + LZCNT = 31 (or 63). In
that case, the extrapolation yields -1 for pg_leftmost_one_pos(0).

I'm not convinced that others on the list will be comfortable with the
generalization suggested in 2 above.

I've quickly put together a PoC patch on top of yours, which
re-implements ceil_log2 using LZCNT coupled with a CPUID check.
Thoughts?

Cheers,
Jesse

Attachment Content-Type Size
0001-Use-LZCOUNT-when-possible.patch text/x-patch 6.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2020-03-02 21:26:32 Re: [PATCH] kNN for btree
Previous Message Andy Fan 2020-03-02 20:25:51 Re: [PATCH] Erase the distinctClause if the result is unique by definition