Re: add AVX2 support to simd.h

From:	Nathan Bossart <nathandbossart(at)gmail(dot)com>
To:	John Naylor <johncnaylorls(at)gmail(dot)com>
Cc:	Ants Aasma <ants(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: add AVX2 support to simd.h
Date:	2024-03-25 21:37:54
Message-ID:	20240325213754.GA3094030@nathanxps13
Lists:	pgsql-hackers

Here is what I have staged for commit. One notable difference in this
version of the patch is that I've changed

+	if (nelem <= nelem_per_iteration)
+		goto one_by_one;

to

+	if (nelem < nelem_per_iteration)
+		goto one_by_one;

I realized that there's no reason to jump to the one-by-one linear search
code when nelem == nelem_per_iteration, as the worst thing that will happen
is that we'll process all the elements twice if the value isn't present in
the array. The benchmark I've been using also shows a significant speedup
for this case with this change (on the order of 75%), which I imagine is
due to some combination of branch prediction, caching, fewer instructions,
etc.
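
To illustrate the reasoning above, here is a minimal scalar sketch of the
control flow. It is not the code from the patch. It assumes the search ends
with one extra block that overlaps the tail of the array, which is what makes
the nelem == nelem_per_iteration case scan every element twice on a miss.
The names lfind32_sketch, block_contains, and NELEM_PER_ITERATION are
hypothetical, and block_contains is a plain loop standing in for a vectorized
comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block width; stands in for nelem_per_iteration. */
#define NELEM_PER_ITERATION 16

/* Scalar stand-in for one vectorized comparison of a full block. */
static bool
block_contains(uint32_t key, const uint32_t *block)
{
	bool		found = false;

	for (size_t i = 0; i < NELEM_PER_ITERATION; i++)
		found |= (block[i] == key);

	return found;
}

/* Sketch only: returns true if key appears in base[0 .. nelem - 1]. */
static bool
lfind32_sketch(uint32_t key, const uint32_t *base, size_t nelem)
{
	size_t		tail_idx;

	/*
	 * Fall back to the one-by-one search only when there are too few
	 * elements to fill a single block.  Note the "<" rather than "<=".
	 */
	if (nelem < NELEM_PER_ITERATION)
	{
		for (size_t i = 0; i < nelem; i++)
		{
			if (base[i] == key)
				return true;
		}
		return false;
	}

	/* Round nelem down to a multiple of the block width (a power of two). */
	tail_idx = nelem & ~((size_t) NELEM_PER_ITERATION - 1);

	for (size_t i = 0; i < tail_idx; i += NELEM_PER_ITERATION)
	{
		if (block_contains(key, &base[i]))
			return true;
	}

	/*
	 * Cover the remaining tail with one final block that ends exactly at the
	 * last element.  It may overlap elements already checked above; when
	 * nelem == NELEM_PER_ITERATION it rechecks the whole array.
	 */
	assert(nelem >= NELEM_PER_ITERATION);
	return block_contains(key, &base[nelem - NELEM_PER_ITERATION]);
}
```

With nelem equal to the block width, a miss costs two block comparisons and
no per-element branches, whereas the one-by-one path would branch on every
element.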

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment	Content-Type	Size
v9-0001-Micro-optimize-pg_lfind32.patch	text/x-diff	5.4 KB

In response to

Re: add AVX2 support to simd.h at 2024-03-25 15:21:21 from Nathan Bossart

Responses

Re: add AVX2 support to simd.h at 2024-03-26 19:09:04 from Nathan Bossart
