Re: add AVX2 support to simd.h

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: John Naylor <johncnaylorls(at)gmail(dot)com>
Cc: Ants Aasma <ants(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: add AVX2 support to simd.h
Date: 2024-03-20 14:31:16
Message-ID: 20240320143116.GA1313857@nathanxps13
Lists: pgsql-hackers

On Wed, Mar 20, 2024 at 01:57:54PM +0700, John Naylor wrote:
> On Tue, Mar 19, 2024 at 11:30 PM Nathan Bossart
> <nathandbossart(at)gmail(dot)com> wrote:
>> I tried to trim some of the branches, and came up with the attached patch.
>> I don't think this is exactly what you were suggesting, but I think it's
>> relatively close. My testing showed decent benefits from using 2 vectors
>> when there aren't enough elements for 4, so I've tried to keep that part
>> intact.
>
> I would caution against that if the benchmark is repeatedly running
> against a static number of elements, because the branch predictor will
> be right all the time (except maybe when it exits a loop, not sure).
> We probably don't need to go to the trouble to construct a benchmark
> with some added randomness, but we have to be careful not to overfit what
> the test is actually measuring.

I don't mind removing the 2-register stuff if that's what you think we
should do. I'm cautiously optimistic that it'd help more than the extra
branch prediction might hurt, and it'd at least help avoid regressing the
lower end for the larger AVX2 registers, but I probably won't be able to
prove that without constructing another benchmark. And TBH I'm not sure
it'll significantly impact any real-world workload, anyway.
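
To make the trade-off concrete, here is a minimal sketch (not the attached
patch) of the shape under discussion: four blocks per loop iteration, one
optional two-block step, then a scalar tail. It also includes a driver that
varies the element count so the branch predictor cannot simply memorize one
path. It is plain C with no intrinsics, and LANES, block_has(),
search_4_2_tail(), xorshift32() and run_varied_lengths() are hypothetical
names for illustration, not anything taken from simd.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one "vector" holds 8 uint32 lanes, as a 256-bit register would. */
#define LANES 8

/* Check one block of LANES elements; a real version would use a SIMD compare. */
static bool
block_has(const uint32_t *p, uint32_t key)
{
	bool		found = false;

	for (int i = 0; i < LANES; i++)
		found |= (p[i] == key);
	return found;
}

/* Linear search: 4 blocks per iteration, one 2-block step, then a scalar tail. */
static bool
search_4_2_tail(const uint32_t *base, size_t nelem, uint32_t key)
{
	size_t		i = 0;

	for (; i + 4 * LANES <= nelem; i += 4 * LANES)
	{
		if (block_has(&base[i], key) |
			block_has(&base[i + LANES], key) |
			block_has(&base[i + 2 * LANES], key) |
			block_has(&base[i + 3 * LANES], key))
			return true;
	}

	/* The extra branch in question: taken only if at least 2 blocks remain. */
	if (i + 2 * LANES <= nelem)
	{
		if (block_has(&base[i], key) | block_has(&base[i + LANES], key))
			return true;
		i += 2 * LANES;
	}

	/* Scalar tail for whatever is left. */
	for (; i < nelem; i++)
	{
		if (base[i] == key)
			return true;
	}
	return false;
}

/* Small xorshift32 PRNG; the state must never be zero. */
static uint32_t
xorshift32(uint32_t *state)
{
	uint32_t	x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	return *state = x;
}

/*
 * Run "iters" searches with lengths drawn from [0, max_elem].  Returns the
 * number of hits so the compiler cannot discard the loop.  The caller is
 * expected to time this function; no timing is done here.
 */
static size_t
run_varied_lengths(const uint32_t *base, size_t max_elem, uint32_t key,
				   size_t iters, uint32_t seed)
{
	uint32_t	state = seed;
	size_t		hits = 0;

	assert(seed != 0);
	for (size_t n = 0; n < iters; n++)
	{
		size_t		nelem = xorshift32(&state) % (max_elem + 1);

		hits += search_4_2_tail(base, nelem, key);
	}
	return hits;
}
```

Timing run_varied_lengths() against the same loop with a fixed nelem would
give a rough idea of how much a static-size benchmark flatters the extra
branch.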

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
