RE: Use __attribute__((target(sse4.2))) for SSE42 CRC32C

From: "Devulapalli, Raghuveer" <raghuveer(dot)devulapalli(at)intel(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Use __attribute__((target(sse4.2))) for SSE42 CRC32C
Date: 2024-11-07 21:30:32
Message-ID: PH8PR11MB82868168DEB5CA1FBE9EEDC1FB5C2@PH8PR11MB8286.namprd11.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> + __attribute__((target("sse4.2")))
>
> These need to be surrounded with
>
> #if defined(__has_attribute) && __has_attribute (target)
>
> so that we still attempt the check on compilers that don't support it (e.g., MSVC).

I assumed configure was only used on linux and ignored this check. Doesn't hurt to add this I suppose.

>
> # Check for Intel SSE 4.2 intrinsics to do CRC calculations.
> #
> -# First check if the _mm_crc32_u8 and _mm_crc32_u64 intrinsics can be used -#
> with the default compiler flags. If not, check if adding the -msse4.2 -# flag helps.
> CFLAGS_CRC is set to -msse4.2 if that's required.
> +# Check if the _mm_crc32_u8 and _mm_crc32_u64 intrinsics can be used #
> +with the __attribute__((target("sse4.2"))).
> PGAC_SSE42_CRC32_INTRINSICS([])
> -if test x"$pgac_sse42_crc32_intrinsics" != x"yes"; then
> - PGAC_SSE42_CRC32_INTRINSICS([-msse4.2])
> -fi
>
> IIUC this means we will never set USE_SSE42_CRC32C_WITH_RUNTIME_CHECK.

I don't think so. USE_SSE42_CRC32C additionally requires SSE4_2_TARGETED to be true which will only happen when explicitly built with -msse4.2. When this explicit compiler flag is missing, we set USE_SSE42_CRC32C_WITH_RUNTIME_CHECK to true. This logic is further down in the configure.ac file:

if test x"$pgac_sse42_crc32_intrinsics" = x"yes" && test x"$SSE4_2_TARGETED" = x"1" ; then
USE_SSE42_CRC32C=1
else

> To maintain the existing behavior, I think we need to still perform two configure
> tests (one with __attribute__((target(...))) and another without), and then we can
> choose whether to set USE_SSE42_CRC32C or
> USE_SSE42_CRC32C_WITH_RUNTIME_CHECK based on the results.
>
> + 'pg_crc32c_sse42_choose.c',
> + 'pg_crc32c_sse42.c',
>
> Can we combine these?

Knowing the AVX-512 will come next, I think it makes sense to keep the runtime choose function separate. Otherwise this gets polluted with runtime choose function, sse42 algorithm and the avx512 algorithm in the next patch. Does that make sense?

Raghuveer

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-11-07 21:32:29 Re: magical eref alias names
Previous Message Robert Haas 2024-11-07 21:29:38 Re: magical eref alias names