RE: Popcount optimization using AVX512

From: "Amonson, Paul D" <paul(dot)d(dot)amonson(at)intel(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Popcount optimization using AVX512
Date: 2024-03-21 19:17:54
Message-ID: BL1PR11MB530478ACABE588A836338E74DC322@BL1PR11MB5304.namprd11.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> -----Original Message-----
> From: David Rowley <dgrowleyml(at)gmail(dot)com>
> Sent: Wednesday, March 20, 2024 5:28 PM
> To: Amonson, Paul D <paul(dot)d(dot)amonson(at)intel(dot)com>
> Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>; Andres Freund
>
> I'm not sure about this "extern negates inline" comment. It seems to me the
> compiler is perfectly free to inline a static function into an external function
> and it's free to inline the static function elsewhere within the same .c file.
>
> The final sentence of the following comment that the 0001 patch removes
> explains this:
>
> /*
> * When the POPCNT instruction is not available, there's no point in using
> * function pointers to vary the implementation between the fast and slow
> * method. We instead just make these actual external functions when
> * TRY_POPCNT_FAST is not defined. The compiler should be able to inline
> * the slow versions here.
> */
>
> Also, have a look at [1]. You'll see f_slow() wasn't even compiled and the code
> was just inlined into f(). I just added the
> __attribute__((noinline)) so that usage() wouldn't just perform constant
> folding and just return 6.
>
> I think, unless you have evidence that some common compiler isn't inlining the
> static into the extern then we shouldn't add the macros.
> It adds quite a bit of churn to the patch and will break out of core code as you
> no longer have functions named pg_popcount32(),
> pg_popcount64() and pg_popcount().

This may be a simple misunderstanding extern != static. If I use the "extern" keyword then a symbol *will* be generated and inline will be ignored. This is NOT true of "static inline", where the compiler will try to inline the method. :)

In this patch set:
* I removed the macro implementation.
* Made everything that could possibly be inlined marked with the "static inline" keyword.
* Conditionally made the *_slow() functions "static inline" when TRY_POPCONT_FAST is not set.
* Found and fixed some whitespace errors in the AVX code implementation.

Thanks,
Paul

Attachment Content-Type Size
v12-0001-Refactor-Split-pg_popcount-functions-into-multiple-f.patch application/octet-stream 16.4 KB
v12-0002-Feature-Added-AVX-512-acceleration-to-the-pg_popcoun.patch application/octet-stream 17.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2024-03-21 19:26:44 Re: Statistics Import and Export
Previous Message Corey Huinker 2024-03-21 19:10:42 Re: Statistics Import and Export