support for SSE2 intrinsics

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: support for SSE2 intrinsics
Date: 2022-08-02 10:22:52
Message-ID: CAFBsxsE2G_H_5Wbw+NOPm70-BK4xxKf86-mRzY=L2sLoQqM+-Q@mail.gmail.com
Lists: pgsql-hackers

Recently there have been several threads where the problem at hand lends
itself to using SSE2 SIMD intrinsics. These are convenient because on
64-bit x86 the instructions are always present and so don't need a runtime
check. To integrate them into our code base, we will need to take some
measures for portability, but after looking around it seems fairly
lightweight:

1. Compiler invocation and symbols

Since SSE2 is part of the AMD64 spec, gcc enables it always:

$ gcc -dM -E - < /dev/null | grep SSE | sort
$ gcc -dM -E -msse2 - < /dev/null | grep SSE | sort
#define __MMX_WITH_SSE__ 1
#define __SSE__ 1
#define __SSE2__ 1
#define __SSE2_MATH__ 1
#define __SSE_MATH__ 1

Passing -m32 discards the "MATH" macros but keeps the rest:

$ gcc -dM -E -m32 - < /dev/null | grep SSE | sort
#define __SSE__ 1
#define __SSE2__ 1

Clang behaves similarly.

MSVC doesn't define __SSE2__ (although it does define __AVX__ etc), but we
can just test for _M_X64 or _M_AMD64 (they are equivalent according to [1],
and we have both in our code base already). We could test for __SSE2__ for
32-bit gcc-alikes in the build farm, but I don't think that would tell us
anything interesting, so we can just test for __x86_64__.
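
As a rough sketch of how the symbols above relate, the following reports which one a given build would match. The function name example_sse2_detection is made up for this illustration, and the __SSE2__ branch covers only the 32-bit gcc-alike case that the proposal deliberately leaves out:

```c
#include <stddef.h>

/*
 * example_sse2_detection (hypothetical name, for illustration only)
 *
 * Returns the name of the predefined symbol that indicates SSE2 for this
 * build, or NULL if none of the symbols discussed above is defined.
 */
static const char *
example_sse2_detection(void)
{
#if defined(__x86_64__)
	/* 64-bit gcc and clang */
	return "__x86_64__";
#elif defined(_M_X64) || defined(_M_AMD64)
	/* 64-bit MSVC, which does not define __SSE2__ */
	return "_M_X64/_M_AMD64";
#elif defined(__SSE2__)
	/* 32-bit gcc-alike with SSE2 enabled; not used by the proposal */
	return "__SSE2__";
#else
	return NULL;
#endif
}
```

The branches are ordered so that the 64-bit tests win, which mirrors the suggestion to test only for __x86_64__ and the MSVC equivalents.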

2. The intrinsics header

From Peter Cordes on StackOverflow [2]:

```
immintrin.h is portable across all compilers, and includes all Intel SIMD
intrinsics, and some scalar extensions like BMI2 _pdep_u32. (For AMD SSE4a
and XOP (Bulldozer-family only, dropped for Zen), you need to include a
different header as well.)

The only reason I can think of for including <emmintrin.h> specifically
would be if you're using MSVC and want to leave intrinsics undefined for
ISA extensions you don't want to depend on.
```

It seems then that MSVC will compile any intrinsic without requiring the
matching ISA extension to be enabled, so to be safe we'd need to take the
latter advice and use <emmintrin.h>.

3. Support for SSE2 intrinsics

This seems to be well-nigh universal AFAICT and doesn't need to be tested
for at configure time. A quick search doesn't turn up anything weird for
Msys or Cygwin. From [2] again, gcc older than 4.4 can generate poor code,
but there is no mention that correctness is a problem.

4. Helper functions

In a couple of proposed patches, there has been some interest in abstracting
some SIMD functionality into functions to hide implementation details.
I agree there are cases where that would help readability and avoid
duplication.

Given all this, the anti-climax is: it seems we can start with something
like src/include/port/simd.h with:

#if (defined(__x86_64__) || defined(_M_AMD64))
#include <emmintrin.h>
#define USE_SSE2
#endif

(plus a comment summarizing the above)

We could include that header in other files, and it would be the place to
put helper functions. Thoughts?
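
To make the helper-function idea concrete, here is a minimal sketch of what one such helper could look like. The name example_chunk_has_byte is hypothetical and not taken from any posted patch; the block repeats the proposed header test so that it is self-contained, and it includes a scalar fallback for platforms without USE_SSE2:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same test as proposed above for src/include/port/simd.h */
#if (defined(__x86_64__) || defined(_M_AMD64))
#include <emmintrin.h>
#define USE_SSE2
#endif

/*
 * example_chunk_has_byte (hypothetical helper, for illustration only)
 *
 * Returns true if any of the 16 bytes starting at 'chunk' equals 'c'.
 */
static inline bool
example_chunk_has_byte(const uint8_t *chunk, uint8_t c)
{
#ifdef USE_SSE2
	/* unaligned load, so callers need not align the buffer */
	const __m128i data = _mm_loadu_si128((const __m128i *) chunk);
	/* broadcast the search byte to all 16 lanes */
	const __m128i needle = _mm_set1_epi8((char) c);
	/* each matching lane becomes 0xFF, all others 0x00 */
	const __m128i eq = _mm_cmpeq_epi8(data, needle);

	/* gather the high bit of every lane into a 16-bit mask */
	return _mm_movemask_epi8(eq) != 0;
#else
	/* portable fallback that gives the same answer */
	for (size_t i = 0; i < 16; i++)
	{
		if (chunk[i] == c)
			return true;
	}
	return false;
#endif
}
```

A caller sees only the function, so the choice between the SSE2 path and the fallback stays inside the header.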

[1] https://docs.microsoft.com/en-us/archive/blogs/reiley/macro-revisited
[2] https://stackoverflow.com/questions/56049110/including-the-correct-intrinsic-header

--
John Naylor
EDB: http://www.enterprisedb.com
