It looks like this patch was developed concurrently with the switch from
pg_attribute_aligned to C11 standard alignas, and it ended up being
committed with the "old" style. I propose the attached patch to update
that. It's only a stylistic change, but good for consistency and as an
example for future code.
On 03.04.26 10:22, John Naylor wrote:
> On Thu, Apr 2, 2026 at 11:17 PM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>>
>> On Thu, Apr 02, 2026 at 10:53:24AM -0500, Nathan Bossart wrote:
>>> I think the new pg_comp_crc32_choose() is infinitely recursing on macOS
>>> because USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK is not defined but
>>> pg_crc32c_armv8_available() returns false. If I trace through that
>>> function, I see that it's going straight to the
>>>
>>> #else
>>> return false;
>>> #endif
>>>
>>> at the end. And sure enough, both HAVE_ELF_AUX_INFO and HAVE_GETAUXVAL
>
> Ah of course.
>
>>> aren't defined in pg_config.h. I think we might need to use sysctlbyname()
>>> to determine PMULL support on macOS, but at this stage of the development
>>> cycle, I would probably lean towards just compiling in the sb8
>>> implementation.
>>
>> Hm. On second thought, that probably regresses macOS builds because it was
>> presumably using the armv8 path without runtime checks before...
>
> I went with the following for v5, and it passes MacOS on my Github CI:
>
> + /* set fallbacks */
> +#ifdef USE_ARMV8_CRC32C
> + /* On e.g. MacOS, our runtime feature detection doesn't work */
> + pg_comp_crc32c = pg_comp_crc32c_armv8;
> +#else
> + pg_comp_crc32c = pg_comp_crc32c_sb8;
> +#endif
> + [...crc and pmull checks]
>
> That should keep scalar hardware support working, but now it'll only
> use direct calls for constant inputs.
>
> I also did some benchmarking on an ARM Neoverse N1 / gcc 8.3
> (attached). There the vector loop still works well all the way down to
> the minimum input size of 64 bytes, and on long inputs it's almost
> twice as fast as scalar. For reproduceability, I slightly modified the
> benchmark we used last year, to make sure the input is aligned
> (attached but not for CI). In the end, I want to add a length check so
> that inputs smaller than 80 bytes go straight to the scalar path.
> Above 80, after alignment adjustments in the preamble, that still
> guarantees at least one loop iteration in the vector path.
>
> --
> John Naylor
> Amazon Web Services