Re: Using POPCNT and other advanced bit manipulation instructions

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: andres(at)anarazel(dot)de, alvherre(at)2ndquadrant(dot)com, thomas(dot)munro(at)enterprisedb(dot)com, andrew(at)tao11(dot)riddles(dot)org(dot)uk, david(dot)rowley(at)2ndquadrant(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Using POPCNT and other advanced bit manipulation instructions
Date: 2019-02-15 01:46:01
Message-ID: 20190215.104601.261252390.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Thu, 14 Feb 2019 16:45:38 -0500, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote in <822(dot)1550180738(at)sss(dot)pgh(dot)pa(dot)us>
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2019-02-14 15:47:13 -0300, Alvaro Herrera wrote:
> >> Hah, I just realized you have to add -mlzcnt in order for these builtins
> >> to use the lzcnt instructions. It goes from something like
> >>
> >> bsrq %rax, %rax
> >> xorq $63, %rax
>
> > I'm confused how this is a general count leading zero operation? Did you
> > use constants or something that allowed ot infer a range in the test? If
> > so the compiler probably did some optimizations allowing it to do the
> > above.
>
> No. If you compile
>
> int myclz(unsigned long long x)
> {
> return __builtin_clzll(x);
> }
>
> at -O2, on just about any x86_64 gcc, you will get
>
> myclz:
> .LFB1:
> .cfi_startproc
> bsrq %rdi, %rax
> xorq $63, %rax
> ret
> .cfi_endproc
>

I understand that the behavior of __builtin_c[tl]z(0) is
undefined from the reason, they convert to bs[rf]. So if we use
these builtins, additional check is required.

https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

> Built-in Function: int __builtin_clz (unsigned int x)
> Returns the number of leading 0-bits in x, starting at the most
> significant bit position. If x is 0, the result is undefined.
>
> Built-in Function: int __builtin_ctz (unsigned int x)
> Returns the number of trailing 0-bits in x, starting at the
> least significant bit position. If x is 0, the result is
> undefined.

And even worse lzcntx is accidentially "fallback"s to bsrx on
unsupported CPUs, which leads to bogus results.
__builtin_clzll(8) = 3, which should be 60.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2019-02-15 02:08:41 Re: pg11.1: dsa_area could not attach to segment
Previous Message Andreas Karlsson 2019-02-15 01:41:13 Re: proposal: pg_restore --convert-to-text