Re: Using POPCNT and other advanced bit manipulation instructions

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Subject: Re: Using POPCNT and other advanced bit manipulation instructions
Date: 2018-12-20 10:57:14
Message-ID: CA+q6zcUCDFANm80O2aHAZVjNASizOEuLbMPijYAuf_KPQ1t01A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Thu, Dec 20, 2018 at 6:53 AM David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>
> Thomas mentions in [1], to get the GCC to use the POPCNT instruction,
> we must pass -mpopcnt in the build flags. After doing a bit of
> research, I found [2] which mentions that some compilers have some
> pattern matching code to allow the popcnt instruction to be used even
> without a macro such as __builtin_popcount(). I believe I've
> correctly written the run-time test to skip using the new popcnt
> function, but if there's any code around that might cause the compiler
> to use the popcnt instruction from pattern matching, then that might
> cause problems.

I've checked for Clang 6, it turns out that indeed it generates popcnt without
any macro, but only in one place for bloom_prop_bits_set. After looking at this
function it seems that it would be benefitial to actually use popcnt there too.

> I am able to measure performance gains from the patch. In a 3.4GB
> table containing a single column with just 10 statistics targets, I
> got the following times after running ANALYZE on the table.

I've tested it too a bit, and got similar results when the patched version is
slightly faster. But then I wonder if popcnt is the best solution here, since
after some short research I found a paper [1], where authors claim that:

Maybe surprisingly, we show that a vectorized approach using SIMD
instructions can be twice as fast as using the dedicated instructions on
recent Intel processors.

[1]: https://arxiv.org/pdf/1611.07612.pdf

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jose Luis Tallon 2018-12-20 10:58:43 Re: Using POPCNT and other advanced bit manipulation instructions
Previous Message Christoph Berg 2018-12-20 10:42:32 [patch] de.po REINDEX error