| From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> | 
|---|---|
| To: | "Amonson, Paul D" <paul(dot)d(dot)amonson(at)intel(dot)com> | 
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> | 
| Subject: | Re: Popcount optimization using AVX512 | 
| Date: | 2024-03-28 21:38:54 | 
| Message-ID: | 20240328213854.GA916732@nathanxps13 | 
| Lists: | pgsql-hackers | 
Here is a v14 of the patch that I think is beginning to approach something
committable.  Besides general review and testing, there are two things that
I'd like to bring up:
* The latest patch set from Paul Amonson appeared to support MSVC in the
  meson build, but not the autoconf one.  I don't have much expertise here,
  so the v14 patch doesn't have any autoconf/meson support for MSVC, which
  I thought might be okay for now.  IIUC we assume that 64-bit/MSVC builds
  can always compile the x86_64 popcount code, but I don't know whether
  that's safe for AVX512.
* I think we need to verify there isn't a huge performance regression for
  smaller arrays.  IIUC those will still require an AVX512 instruction or
  two as well as a function call, which might add some noticeable overhead.
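
To make the small-array concern concrete, here is a minimal sketch of one possible mitigation: count short inputs inline and only pay for the function-pointer call on longer ones. This is not taken from the v14 patch. Every name here (`popcount_small`, `popcount_wide`, `popcount_large`, `popcount_dispatch`) and the 32-byte `POPCNT_INLINE_THRESHOLD` are hypothetical, and `popcount_wide` is a portable stand-in for an AVX512 routine, so the sketch assumes only GCC or Clang builtins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cutoff; a real value would have to come from benchmarks. */
#define POPCNT_INLINE_THRESHOLD 32

/* Byte-at-a-time count: the cheap path intended for short inputs. */
static inline uint64_t
popcount_small(const unsigned char *buf, size_t len)
{
    uint64_t    count = 0;

    for (size_t i = 0; i < len; i++)
        count += (uint64_t) __builtin_popcount(buf[i]);
    return count;
}

/*
 * Portable stand-in for a wide (e.g. AVX512) implementation: it counts
 * eight bytes per step and finishes the tail with popcount_small().
 */
static uint64_t
popcount_wide(const unsigned char *buf, size_t len)
{
    uint64_t    count = 0;

    while (len >= sizeof(uint64_t))
    {
        uint64_t    word;

        memcpy(&word, buf, sizeof(word));   /* avoids unaligned loads */
        count += (uint64_t) __builtin_popcountll(word);
        buf += sizeof(word);
        len -= sizeof(word);
    }
    return count + popcount_small(buf, len);
}

/* Assumed to be chosen once, after a runtime CPU feature check. */
static uint64_t (*popcount_large) (const unsigned char *, size_t) = popcount_wide;

/* Short inputs never reach the indirect call. */
static inline uint64_t
popcount_dispatch(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    if (len < POPCNT_INLINE_THRESHOLD)
        return popcount_small(buf, len);
    return popcount_large(buf, len);
}
```

Whether a length check like this pays for itself, and where the cutoff belongs, is exactly what a benchmark over a range of small array sizes would need to show.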
-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
| Attachment | Content-Type | Size | 
|---|---|---|
| v14-0001-AVX512-popcount-support.patch | text/x-diff | 22.7 KB | 