RE: Popcount optimization using AVX512

From: "Amonson, Paul D" <paul(dot)d(dot)amonson(at)intel(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Matthias van de Meent" <boekewurm+postgres(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Popcount optimization using AVX512
Date: 2024-03-15 15:31:17
Message-ID: BL1PR11MB530473FB4E9CBD68C28896F7DC282@BL1PR11MB5304.namprd11.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> -----Original Message-----
> From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
> Sent: Friday, March 15, 2024 8:06 AM
> To: Amonson, Paul D <paul(dot)d(dot)amonson(at)intel(dot)com>
> Cc: Andres Freund <andres(at)anarazel(dot)de>; Alvaro Herrera <alvherre(at)alvh(dot)no-
> ip.org>; Shankaran, Akash <akash(dot)shankaran(at)intel(dot)com>; Noah Misch
> <noah(at)leadboat(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; Matthias van de
> Meent <boekewurm+postgres(at)gmail(dot)com>; pgsql-
> hackers(at)lists(dot)postgresql(dot)org
> Subject: Re: Popcount optimization using AVX512
>
> Which test suite did you run? Those numbers seem potentially
> indistinguishable from noise, which probably isn't great for such a large patch
> set.

I ran...
psql -c "select bitcount(column) from table;"
...in a loop with "column" widths of 84, 4096, 8192, and 16384 containing random data. There DB has 1 million rows. In the loop before calling the select I have code to clear all system caches. If I omit the code to clear system caches the margin of error remains the same but the improvement percent changes from 1.2% to 14.6% (much less I/O when cached data is available).

> I ran John Naylor's test_popcount module [0] with the following command on
> an i7-1195G7:
>
> time psql postgres -c 'select drive_popcount(10000000, 1024)'
>
> Without your patches, this seems to take somewhere around 8.8 seconds.
> With your patches, it takes 0.6 seconds. (I re-compiled and re-ran the tests a
> couple of times because I had a difficult time believing the amount of
> improvement.)

When I tested the code outside postgres in a micro benchmark I got 200-300% improvements. Your results are interesting, as it implies more than 300% improvement. Let me do some research on the benchmark you referenced. However, in all cases it seems that there is no regression so should we move forward on merging while I run some more local tests?

Thanks,
Paul

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2024-03-15 15:56:14 Re: Weird test mixup
Previous Message Nathan Bossart 2024-03-15 15:06:11 Re: Popcount optimization using AVX512