Re: Use POPCNT on MSVC

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Use POPCNT on MSVC
Date: 2021-08-04 19:01:50
Message-ID: CAFBsxsHfGai31H2_ha2ZAMmGEZbkDgCop8OzdR05AX6CQkWvtQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 3, 2021 at 11:36 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Tue, 3 Aug 2021 at 22:43, John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
wrote:
> > 1. the __popcnt64() intrinsic is put inside pg_popcount64_asm(), which
is a bit of a misnomer since it's not assembly. Renaming s/_asm/_fast/
would help it look better. But then looking around, other platforms have
intrinsics coded, but for some reason they're put in pg_popcount64_slow(),
where the compiler will emit instructions we could easily write ourselves
in C (and without #ifdefs) since without the right CFLAGS these intrinsics
won't emit a popcnt instruction. Is MSVC different in that regard? If so,
it might be worth a comment.
>
> Yeah, the names no longer fit so well after stuffing the MSVC
> intrinsic into the asm function. The reason I did it that way is down
> to what I read in the docs. Namely:
>
> "If you run code that uses these intrinsics on hardware that doesn't
> support the popcnt instruction, the results are unpredictable."
>
> So, it has to go somewhere where we're sure the CPU supports POPCNT
> and that seemed like the correct place.
>
> From what I understand of GCC and __builtin_popcountl(), the code
> it'll output will depend on the -mpopcnt flag. So having
> __builtin_popcountl() does not mean the popcnt instruction will be
> used. The Microsoft documentation indicates that's not the case for
> __popcnt().

Okay, "unpredictable" sounds bad.

> > 2. (defined(_MSC_VER) && defined(_WIN64) lead to a runtime check for
the CPUID, which is fine, but now next to it HAVE_X86_64_POPCNTQ looks
strange because the latter symbol comes from a specific configure test --
the two don't seem equivalent, but maybe they are because of what MSVC
does? That would be nice to spell out here.
>
> The problem there is that we define HAVE_X86_64_POPCNTQ based on the
> outcome of configure so it does not get set for MSVC. Maybe it's
> worth coming up with some other constant that can be defined or we
> could just do:
>
> #if defined(_MSC_VER) && defined(_WIN64)
> #define HAVE_X86_64_POPCNTQ
> #endif

That seems fine. I don't know PG can build with Arm on Windows, but for the
cpuid to work, it seems safer to also check for __x86_64?

> I think the reason for the asm is that __builtin_popcountl does not
> mean popcnt will be used. Maybe we could have done something like
> compile pg_bitutils.c with -mpopcnt, and have kept the run-time check,
> but the problem there is that means that the compiler might end up
> using that instruction in some other function in that file that we
> don't want it to. It looks like my patch in [1] did pass the -mpopcnt
> flag which Tom fixed.

Ah, that makes sense. (If we someday offer a configure option for
x86-64-v2, we can use intrinsics in the asm functions, and call them
directly. Yet another different thread.)

--
John Naylor
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2021-08-04 19:09:30 Re: Lowering the ever-growing heap->pd_lower
Previous Message Andres Freund 2021-08-04 19:01:10 Re: A varint implementation for PG?