Re: Improve CRC32C performance on SSE4.2

From: Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>
To: John Naylor <johncnaylorls(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andy Fan <zhihuifan1213(at)163(dot)com>, "Devulapalli, Raghuveer" <raghuveer(dot)devulapalli(at)intel(dot)com>, Jesper Pedersen <jesperpedersen(dot)db(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>
Subject: Re: Improve CRC32C performance on SSE4.2
Date: 2025-07-13 19:28:11
Message-ID: CAE-ML+-X8mnx-AsD-9QtB7rkWvCmcb4+VJWOrg0KPu5K2mucSA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 17, 2025 at 1:55 AM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:

I took the minimal repro from [1] and compared the code that clang 17
generates at -O0 [2] and at -O3 [3]. At -O3 (and also at -O1 and -O2),
the two statements below compile to the two instructions that follow them:

castval = _mm512_castsi128_si512(_mm_cvtsi32_si128(crc0));
x0 = _mm512_xor_si512(castval, x0);

vinserti128 ymm0, ymm0, xmmword ptr [rip + .LCPI1_0], 0
vpxorq zmm0, zmm0, zmmword ptr [rdi]

Reading vpxorq's pseudocode [4], it seems that it zeroes out the upper
bits of the destination:

DEST[MAXVL-1:VL] := 0

The same holds for clang 17 at -O0 when _mm512_zextsi128_si512 is used
instead [5]: vpxor and vbroadcast128 are emitted, which also seem to
zero out the upper bits.

So -O1 through -O3 were indeed emitting instructions that zero-extend,
thus avoiding the undefined behavior.
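
To illustrate why the zero-extension matters for the XOR into x0, here is
a small portable C sketch. It is a toy model and does not use the real
intrinsics: zmm_model, model_zext, model_cast, model_xor and
upper_bits_preserved are hypothetical names, and the "garbage" parameter
only stands in for whatever stale contents an unspecified upper part of
the register might hold.

```c
#include <stdint.h>
#include <string.h>

/* Toy model of a 512-bit register as eight 64-bit lanes (lane 0 = low bits). */
typedef struct { uint64_t lane[8]; } zmm_model;

/* Models _mm512_zextsi128_si512(_mm_cvtsi32_si128(crc0)):
 * crc0 in the low 32 bits, every other bit zero. */
static zmm_model model_zext(uint32_t crc0)
{
    zmm_model r;
    memset(&r, 0, sizeof r);
    r.lane[0] = crc0;
    return r;
}

/* Models _mm512_castsi128_si512(_mm_cvtsi32_si128(crc0)):
 * the low 128 bits are defined (crc0 zero-extended to 128 bits), while
 * the upper 384 bits are unspecified. 'garbage' is an assumption that
 * stands in for those unspecified bits. */
static zmm_model model_cast(uint32_t crc0, uint64_t garbage)
{
    zmm_model r;
    r.lane[0] = crc0;
    r.lane[1] = 0;
    for (int i = 2; i < 8; i++)
        r.lane[i] = garbage;
    return r;
}

/* Models _mm512_xor_si512: lane-wise XOR over all 512 bits. */
static zmm_model model_xor(zmm_model a, zmm_model b)
{
    zmm_model r;
    for (int i = 0; i < 8; i++)
        r.lane[i] = a.lane[i] ^ b.lane[i];
    return r;
}

/* Returns 1 if bits 128..511 of 'result' still equal those of 'x0',
 * i.e. the XOR only touched the low 128 bits as intended. */
static int upper_bits_preserved(zmm_model result, zmm_model x0)
{
    for (int i = 2; i < 8; i++)
        if (result.lane[i] != x0.lane[i])
            return 0;
    return 1;
}
```

In this model, model_xor(model_zext(crc0), x0) always leaves lanes 2..7 of
x0 intact, whereas model_xor(model_cast(crc0, garbage), x0) does so only
when garbage happens to be zero, which is the situation the optimized
builds ended up in.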

[1] https://www.postgresql.org/message-id/PH8PR11MB8286A89AF2B104044187E54DFB70A%40PH8PR11MB8286.namprd11.prod.outlook.com
[2] https://godbolt.org/z/ahx9PePYr
[3] https://godbolt.org/z/W4WPzjnbb
[4] https://www.felixcloutier.com/x86/pxor#vpxorq--evex-encoded-versions-
[5] https://godbolt.org/z/46brvrnnv

Regards,
Deep (VMware)
