use ARM intrinsics in pg_lfind32() where available

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: john(dot)naylor(at)enterprisedb(dot)com
Subject: use ARM intrinsics in pg_lfind32() where available
Date: 2022-08-19 20:08:29
Message-ID: 20220819200829.GA395728@nathanxps13
Lists: pgsql-hackers

Hi hackers,

This is a follow-up for recent changes that optimized [sub]xip lookups in
XidInMVCCSnapshot() on Intel hardware [0] [1]. I've attached a patch that
uses ARM Advanced SIMD (Neon) intrinsic functions where available to speed
up the search. The approach is nearly identical to the SSE2 version, and
the usual benchmark [2] shows similar improvements.

 writers    head    simd
       8     866     836
      16     849     833
      32     782     822
      64     846     833
     128     805     821
     256     722     739
     512     529     674
     768     374     608
    1024     268     522

I've tested the patch on recent macOS (M1 Pro) and Amazon Linux
(Graviton2) machines, and I've confirmed that the instructions aren't used
on a Linux/Intel machine. I did add a new configure check to see if the
relevant intrinsics are available, but I didn't add a runtime check like
there is for the CRC instructions since the compilers I used support these
intrinsics by default. (I don't think a runtime check would work very well
with the inline function, anyway.) AFAICT these intrinsics are pretty
standard on aarch64, although IIUC the spec indicates that they are
technically optional. I suspect that a simple check for "aarch64" would be
sufficient, but I haven't investigated the level of compiler support yet.
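
For reference, here is a rough sketch of the general technique. It is not the
attached patch. The function name lfind32_sketch and the SKETCH_USE_NEON macro
are made up for illustration. The __aarch64__/__ARM_NEON guard stands in for
the configure check as an assumption, and the loop handles one 4-lane vector
per iteration purely for brevity. On other platforms only the scalar loop is
compiled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a plain compiler-macro test instead of a configure check. */
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define SKETCH_USE_NEON 1
#endif

/*
 * Hypothetical helper: returns true if "key" occurs in base[0 .. nelem-1].
 */
static inline bool
lfind32_sketch(uint32_t key, const uint32_t *base, size_t nelem)
{
    size_t      i = 0;

    assert(base != NULL || nelem == 0);

#ifdef SKETCH_USE_NEON
    {
        /* Broadcast the key into all four 32-bit lanes. */
        const uint32x4_t keys = vdupq_n_u32(key);

        /* Compare four elements per iteration. */
        for (; i + 4 <= nelem; i += 4)
        {
            uint32x4_t  vals = vld1q_u32(&base[i]);
            /* Each matching lane becomes all ones, others zero. */
            uint32x4_t  cmp = vceqq_u32(keys, vals);

            /* A nonzero horizontal maximum means some lane matched. */
            if (vmaxvq_u32(cmp) != 0)
                return true;
        }
    }
#endif

    /* Scalar loop: the tail on aarch64, the whole array elsewhere. */
    for (; i < nelem; i++)
    {
        if (base[i] == key)
            return true;
    }

    return false;
}
```

A caller would use it like any linear search, e.g.
lfind32_sketch(xid, xip, xcnt), where xid, xip and xcnt are placeholder names
for the value, the array and its length.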



Nathan Bossart
Amazon Web Services

Attachment Content-Type Size
v1-0001-Use-ARM-Advanced-SIMD-intrinsic-functions-in-pg_l.patch text/x-diff 6.7 KB

