Re: Using POPCNT and other advanced bit manipulation instructions

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Using POPCNT and other advanced bit manipulation instructions
Date: 2019-02-14 22:57:54
Message-ID: 4022.1550185074@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> That leads me to the attached patch. It creates a new file
> pg_popcount.c which is the only one compiled with -mpopcnt (if
> available); if there's no compiler switch to enable POPCNT, we just
> don't compile the file. I'm not sure that's kosher -- in particular I'm
> not sure if it can fail when POPCNT is enabled by other flags and
> -mpopcnt is not needed at all. I think our c-compiler.m4 stuff is a bit
> too simplistic there: it just assumes that -mpopcnt is always required.

Yes, the configure test for this stuff is really pretty broken.
It's conflating two nearly independent questions: (1) does the compiler
have __builtin_popcount(), and (2) does the compiler accept -mpopcnt.
It is certainly the case that (1) may hold without (2); in fact, every
recent non-x86_64 gcc is a counterexample to how that's done in HEAD.

I think we need a clean test for __builtin_popcount(), and to be willing
to use it if available, independently of -mpopcnt. Then separately we
should test to see if -mpopcnt works, probably with the same
infrastructure we use for checking for other compiler flags, viz

# Optimization flags for specific files that benefit from vectorization
PGAC_PROG_CC_VAR_OPT(CFLAGS_VECTOR, [-funroll-loops])
PGAC_PROG_CC_VAR_OPT(CFLAGS_VECTOR, [-ftree-vectorize])
+ # Optimization flags for bit-twiddling
+ PGAC_PROG_CC_VAR_OPT(CFLAGS_POPCNT, [-mpopcnt])
# We want to suppress clang's unhelpful unused-command-line-argument warnings

Then the correct test to see if we want to build pg_popcount.c (BTW,
please pick a less generic name for that) and the choose function
is whether we have *both* HAVE__BUILTIN_POPCOUNT and nonempty
CFLAGS_POPCNT.

I don't think this'd be fooled by user-specified CFLAGS. The worst
possible outcome is that it builds a function that we intended would
use POPCNT but it's falling back to some other implementation, in
case the compiler has a switch named -mpopcnt but it doesn't do what
we think it does, or the user overrode things by adding -mno-popcnt.
That would really be nearly cost-free, other than the overhead of
the choose function the first time through: both of the execution
functions would be reducing to __builtin_popcount(), for whatever
version of that the compiler is giving us, so the choice wouldn't
matter.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2019-02-14 23:06:16 Re: [PATCH] xlogreader: do not read a file block twice
Previous Message Alvaro Herrera 2019-02-14 22:43:35 Re: Using POPCNT and other advanced bit manipulation instructions