Re: Use compiler intrinsics for bit ops in hash

From: David Fetter <david(at)fetter(dot)org>
To: Jesse Zhang <sbjesse(at)gmail(dot)com>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Subject: Re: Use compiler intrinsics for bit ops in hash
Date: 2020-03-08 18:34:06
Message-ID: 20200308183405.GZ13804@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 02, 2020 at 12:45:21PM -0800, Jesse Zhang wrote:
> Hi David,
>
> On Wed, Feb 26, 2020 at 9:56 PM David Fetter <david(at)fetter(dot)org> wrote:
> >
> > On Wed, Feb 26, 2020 at 09:12:24AM +0100, David Fetter wrote:
> > > On Fri, Jan 31, 2020 at 04:59:18PM +0100, David Fetter wrote:
> > > > On Wed, Jan 15, 2020 at 03:45:12PM -0800, Jesse Zhang wrote:
> > > > > On Tue, Jan 14, 2020 at 2:09 PM David Fetter <david(at)fetter(dot)org> wrote:
> > > > > > > The changes in hash AM and SIMPLEHASH do look like a net positive
> > > > > > > improvement. My biggest cringe might be in pg_bitutils:
> > > > > > >
> > > > > > > 1. Is ceil_log2_64 dead code?
> > > > > >
> > > > > > Let's call it nascent code. I suspect there are places it could go, if
> > > > > > I look for them. Also, it seemed silly to have one without the other.
> > > > > >
> > > > >
> > > > > While not absolutely required, I'd like us to find at least one
> > > > > place and start using it. (Clang also nags at me when we have
> > > > > unused functions).
> > > >
> > > > Done in the expanded patches attached.
> I see that you've found use of it in dynahash, thanks!
>
> The math in the new (from v4 to v6) patch is wrong: it yields
> ceil_log2(1) = 1 or next_power_of_2(1) = 2. I can see that you lifted
> the restriction of "num greater than one" for ceil_log2() in this patch
> set, but it's now _more_ problematic to base those functions on
> pg_leftmost_one_pos().
>
> I'm not comfortable with your changes to pg_leftmost_one_pos() to remove
> the restriction on word being non-zero. Specifically
> pg_leftmost_one_pos() is made to return 0 on 0 input. While none of its
> current callers (in HEAD) is harmed, this introduces muddy semantics:
>
> 1. pg_leftmost_one_pos is semantically undefined on 0 input: scanning
> for a set bit in a zero word won't find it anywhere.
>
> 2. we can _try_ generalizing it to accommodate ceil_log2 by
> extrapolating based on the invariant that BSR + LZCNT = 31 (or 63). In
> that case, the extrapolation yields -1 for pg_leftmost_one_pos(0).
>
> I'm not convinced that others on the list will be comfortable with the
> generalization suggested in 2 above.
>
> I've quickly put together a PoC patch on top of yours, which
> re-implements ceil_log2 using LZCNT coupled with a CPUID check.
> Thoughts?

Per discussion on IRC with Andrew (RhodiumToad) Gierth:

The runtime detection means there's always an indirect call overhead
and no way to inline. This is counter to what using compiler
intrinsics is supposed to do.

It's better to rely on the compiler, because:
(a) The compiler often knows whether the value can or can't be 0 and
can therefore skip a conditional jump.
(b) If you're targeting a recent microarchitecture, the compiler can
just use the right instruction.
(c) Even if the conditional branch is left in, it's not a big overhead.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-03-08 18:37:49 Re: pg11+: pg_ls_*dir LIMIT 1: temporary files .. not closed at end-of-transaction
Previous Message Tom Lane 2020-03-08 18:28:57 Re: Remove utils/acl.h from catalog/objectaddress.h