From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Yuqi Gu <Yuqi(dot)Gu(at)arm(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Optimize Arm64 crc32c implementation in Postgresql |
Date: | 2018-03-05 18:44:35 |
Message-ID: | 6811959f-e7e5-74e2-4645-d5eb0d40d10d@iki.fi |
Lists: | pgsql-hackers |
On 02/03/18 06:42, Andres Freund wrote:
> On 2018-03-02 11:37:52 +1300, Thomas Munro wrote:
>> So... that stuff probably needs either a configure check for the
>> getauxval function and/or those headers, or an OS check?
>
> It'd probably be better to not rely on os specific headers, and instead
> directly access the capabilities.
Anyone got an idea on how to do that? I googled around a bit, but
couldn't find any portable examples. All the examples I could find were
very Linux-specific, and used getauxval(), except for this in the FreeBSD
kernel itself:
https://github.com/freebsd/freebsd/blob/master/sys/libkern/crc32.c#L775.
I'm no expert on FreeBSD, but that doesn't seem suitable for use in a
user program.
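For reference, the Linux-specific approach mentioned above looks roughly like the sketch below. The helper name arm_crc32_available() is made up for illustration and is not taken from the patch; the sketch assumes that <sys/auxv.h> and <asm/hwcap.h> provide getauxval(), AT_HWCAP and HWCAP_CRC32 on ARM64 Linux, and it simply reports "not available" everywhere else.

```c
#include <stdbool.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>    /* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>   /* HWCAP_CRC32 */
#endif

/*
 * Hypothetical helper (name is an assumption, not from the patch):
 * returns true if the kernel reports the ARMv8 CRC32 instructions.
 */
static bool
arm_crc32_available(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_CRC32)
    /* Linux exposes CPU feature bits through the auxiliary vector. */
    return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
#else
    /* No portable way to ask; assume the instructions are absent. */
    return false;
#endif
}
```

A caller would typically run this check once and then pick between a hardware and a software CRC routine through a function pointer.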
In any case, I reworked this patch to follow the example of the existing
code more closely. Notable changes:
* Use compiler intrinsics instead of inline assembly.
* If the target architecture has them, use the CRC instructions without
a runtime check. You'll get that if you use "CFLAGS=-march=armv8.1-a", for
example, as the CRC Extension was made mandatory in ARMv8.1. This
should work even on FreeBSD or other non-Linux systems, where
getauxval() is not available.
* I removed the loop to handle two uint64's at a time, using the LDP
instruction. I couldn't find a compiler intrinsic for that, and that loop
was actually slower, at least on the system I have access to, than a
straightforward loop that processes 8 bytes at a time.
* I tested this on Linux, with gcc and clang, on an ARM64 virtual
machine that I had available (not an emulator, but a VM on a shared
ARM64 server).
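To illustrate the intrinsics-based, 8-bytes-at-a-time loop described in the list above, here is a minimal sketch. It is not the code from the attached patch: the function name crc32c_sketch() is hypothetical, the ARM path assumes a little-endian target and the ACLE intrinsics __crc32cd() and __crc32cb() from <arm_acle.h>, and a plain bitwise fallback is included only so that the sketch also builds where __ARM_FEATURE_CRC32 is not defined.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifdef __ARM_FEATURE_CRC32
#include <arm_acle.h>    /* __crc32cd(), __crc32cb() */
#endif

/*
 * Hypothetical helper (not the patch's function): CRC-32C of a buffer,
 * with the usual 0xFFFFFFFF initial value and final inversion.
 */
static uint32_t
crc32c_sketch(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t    crc = 0xFFFFFFFFu;

#ifdef __ARM_FEATURE_CRC32
    /* Compile-time path: no runtime check. Assumes little-endian. */
    while (len >= 8)
    {
        uint64_t    chunk;

        memcpy(&chunk, p, 8);   /* sidesteps alignment concerns */
        crc = __crc32cd(crc, chunk);
        p += 8;
        len -= 8;
    }
    while (len > 0)
    {
        crc = __crc32cb(crc, *p++);
        len--;
    }
#else
    /* Portable bitwise fallback, reflected polynomial 0x82F63B78. */
    while (len > 0)
    {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        len--;
    }
#endif

    return crc ^ 0xFFFFFFFFu;
}
```

Both paths should agree on the standard CRC-32C check value: the nine ASCII bytes "123456789" give 0xE3069283.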
- Heikki
Attachment | Content-Type | Size |
---|---|---|
arm64ce-crc32c-1.patch | text/x-patch | 22.8 KB |