Re: Popcount optimization using AVX512

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: "Amonson, Paul D" <paul(dot)d(dot)amonson(at)intel(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Popcount optimization using AVX512
Date: 2024-03-30 03:22:09
Message-ID: 20240330032209.GA2018686@nathanxps13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here's a v17 of the patch. This one has configure checks for everything
(i.e., CPUID, XGETBV, and the AVX512 intrinsics) as well as the relevant
runtime checks (i.e., we call CPUID to check for XGETBV and AVX512 POPCNT
availability, and we call XGETBV to ensure the ZMM registers are enabled).
I restricted the AVX512 configure checks to x86_64 since we know we won't
have TRY_POPCNT_FAST on 32-bit, and we rely on pg_popcount_fast() as our
fallback implementation in the AVX512 version. Finally, I removed the
inline assembly in favor of using the _xgetbv() intrinsic on all systems.
It looks like that's available on gcc, clang, and msvc, although it
sometimes requires -mxsave, so that's applied to
pg_popcount_avx512_choose.o as needed. I doubt this will lead to SIGILLs,
but it's admittedly a little shaky.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v17-0001-AVX512-popcount-support.patch text/x-diff 28.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2024-03-30 03:38:11 Re: broken JIT support on Fedora 40
Previous Message Jeff Davis 2024-03-30 01:02:40 Re: Statistics Import and Export