From: | "Amonson, Paul D" <paul(dot)d(dot)amonson(at)intel(dot)com> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>, "Andres Freund" <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | RE: Popcount optimization using AVX512 |
Date: | 2024-03-28 22:03:04 |
Message-ID: | BL1PR11MB5304E51336123CE6F041A920DC3B2@BL1PR11MB5304.namprd11.prod.outlook.com |
Lists: | pgsql-hackers |
> -----Original Message-----
> From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
> Sent: Thursday, March 28, 2024 2:39 PM
> To: Amonson, Paul D <paul(dot)d(dot)amonson(at)intel(dot)com>
>
> * The latest patch set from Paul Amonson appeared to support MSVC in the
> meson build, but not the autoconf one. I don't have much expertise here,
> so the v14 patch doesn't have any autoconf/meson support for MSVC, which
> I thought might be okay for now. IIUC we assume that 64-bit/MSVC builds
> can always compile the x86_64 popcount code, but I don't know whether
> that's safe for AVX512.
I also do not know how to integrate MSVC with Autoconf. The CI uses MSVC+Meson+Ninja, so I stuck with that.
> * I think we need to verify there isn't a huge performance regression for
> smaller arrays. IIUC those will still require an AVX512 instruction or
> two as well as a function call, which might add some noticeable overhead.
Setting your changes aside, I had already tested small buffers. Below 512 bytes there was no measurable regression (the only added cost was one extra condition check). From 512 to 4096 bytes, results ranged from no regression to some gains. Assuming you introduced no extra function calls, the results should be the same.
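To make the "one extra condition check" concrete, here is a minimal sketch of a size-based dispatch. It is not the code from the patch: the names `popcount_bytewise`, `popcount_wide`, `popcount_buffer`, and `WIDE_PATH_MIN_BYTES` are hypothetical, the 512-byte cutoff is an assumption taken from the numbers above, and `popcount_wide` is a plain C stand-in that walks 64-byte chunks where a real build would use AVX-512 instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed cutoff, taken from the 512-byte figure in this message. */
#define WIDE_PATH_MIN_BYTES 512

/* Portable bit count: clears the lowest set bit until none remain. */
static unsigned popcount64(uint64_t x)
{
    unsigned n = 0;

    while (x)
    {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Baseline path: one byte at a time. */
uint64_t popcount_bytewise(const unsigned char *buf, size_t len)
{
    uint64_t total = 0;

    for (size_t i = 0; i < len; i++)
        total += popcount64(buf[i]);
    return total;
}

/*
 * Stand-in for the wide path: consumes 64 bytes (the width of one ZMM
 * register) per iteration, then finishes the tail bytewise.
 */
uint64_t popcount_wide(const unsigned char *buf, size_t len)
{
    uint64_t total = 0;

    while (len >= 64)
    {
        uint64_t words[8];

        memcpy(words, buf, sizeof(words));  /* avoids unaligned loads */
        for (int i = 0; i < 8; i++)
            total += popcount64(words[i]);
        buf += 64;
        len -= 64;
    }
    return total + popcount_bytewise(buf, len);
}

/* The only cost added for small inputs is this single length comparison. */
uint64_t popcount_buffer(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    if (len < WIDE_PATH_MIN_BYTES)
        return popcount_bytewise(buf, len);
    return popcount_wide(buf, len);
}
```

With this shape, a caller passing a 100-byte buffer pays for one comparison and then runs the same loop as before, which is consistent with seeing no measurable regression below the cutoff.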
> I forgot to mention that I also want to understand whether we can actually assume availability of XGETBV when CPUID says we support AVX512:
You cannot assume that. There are edge cases where AVX-512 was detected on one system at compile time but is not actually enabled by the kernel on a second system at runtime, even though that second CPU has the hardware feature.
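A minimal sketch of the kind of runtime check this implies, assuming GCC or Clang on x86-64 (it uses `<cpuid.h>` and inline assembly, so it is not the MSVC path). The function names are hypothetical and not taken from the patch. The idea is to confirm OSXSAVE before executing XGETBV, then confirm that the OS has enabled the ZMM and opmask state in XCR0, and only then trust the AVX-512 feature bits from CPUID leaf 7.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#endif

/*
 * XCR0 state bits that must all be set for AVX-512 use:
 * bit 1 = SSE (XMM), bit 2 = AVX (upper YMM), bit 5 = opmask,
 * bit 6 = upper halves of ZMM0-15, bit 7 = ZMM16-31.
 */
#define XCR0_AVX512_MASK UINT64_C(0xE6)

/* Pure bit test, kept separate so it can be checked without hardware. */
bool xcr0_enables_avx512(uint64_t xcr0)
{
    return (xcr0 & XCR0_AVX512_MASK) == XCR0_AVX512_MASK;
}

#ifdef HAVE_X86_CPUID
/* Reads XCR0. Only safe to call after OSXSAVE has been confirmed. */
static uint64_t read_xcr0(void)
{
    uint32_t lo, hi;

    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return ((uint64_t) hi << 32) | lo;
}
#endif

/* Hypothetical helper: true only if CPU and OS both allow the AVX-512 path. */
bool avx512_popcnt_usable(void)
{
#ifdef HAVE_X86_CPUID
    unsigned int eax, ebx, ecx, edx;

    /* CPUID leaf 1, ECX bit 27 (OSXSAVE): the OS has enabled XGETBV. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) || !(ecx & (1u << 27)))
        return false;

    /* The OS must save and restore ZMM and opmask registers. */
    if (!xcr0_enables_avx512(read_xcr0()))
        return false;

    /* CPUID leaf 7, subleaf 0: EBX bit 16 = AVX512F, ECX bit 14 = VPOPCNTDQ. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return false;
    return (ebx & (1u << 16)) != 0 && (ecx & (1u << 14)) != 0;
#else
    return false;               /* not x86-64, or an unsupported compiler */
#endif
}
```

Because the result depends on the machine the binary runs on, not the one it was built on, a check like this would be evaluated once at startup and used to pick the popcount implementation.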
I will review the new patch to see if anything jumps out at me.
Thanks,
Paul