Re: Improve CRC32C performance on SSE4.2

From: John Naylor <johncnaylorls(at)gmail(dot)com>
To: Andy Fan <zhihuifan1213(at)163(dot)com>
Cc: "Devulapalli, Raghuveer" <raghuveer(dot)devulapalli(at)intel(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jesper Pedersen <jesperpedersen(dot)db(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>
Subject: Re: Improve CRC32C performance on SSE4.2
Date: 2025-06-17 08:55:06
Message-ID: CANWCAZZK0hVk-N71JsuXPOB0ALy8BBOfRjSA4Nz2Kpt2RCLU0Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 17, 2025 at 6:40 AM Andy Fan <zhihuifan1213(at)163(dot)com> wrote:
>
> "Devulapalli, Raghuveer" <raghuveer(dot)devulapalli(at)intel(dot)com> writes:
>
> > Great catch! From the intrinsic manual:
> >
> > Cast vector of type __m128i to type __m512i; the upper 384 bits of the
> > result are undefined.

Thanks Raghuveer and Nathan, for the diagnosis!

> Just be curious, what kind of optimization (like what -O2 does) could
> mask this issue?

In case Andy is asking about "how" rather than "under what
circumstances", my guess is that -O1 and above just happened to choose
instructions that also zero-extend, which are common. -O0 output doesn't
mirror the naive, straightforward structure of what the programmer
wrote; it's more like an "exploded" representation suitable for later
optimization passes. That's why it always looks goofy.

> > Replacing that with _mm512_zextsi128_si512 fixes the problem.

Here's a patch for testing, which also reverts the previous
workaround. Help welcome, but I still promise to test it in the near
future regardless.
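
For illustration only, here is a minimal standalone sketch of the
difference, not the attached patch. The names upper_bits_are_zero and
check_zero_extend are hypothetical, and it assumes GCC or Clang on
x86-64 with a compiler recent enough to provide
_mm512_zextsi128_si512. It widens a 128-bit value with the
zero-extending intrinsic and checks that the upper 384 bits are zero,
which _mm512_castsi128_si512 does not guarantee.

```c
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_X86_INTRINSICS 1
#endif

#ifdef HAVE_X86_INTRINSICS
/*
 * Widen a 128-bit value to 512 bits with the zero-extending intrinsic
 * and verify the result: the low 128 bits must be preserved and the
 * upper 384 bits must be zero. Returns 1 on success, 0 otherwise.
 *
 * With _mm512_castsi128_si512 the upper 384 bits would be undefined,
 * so the same check could pass or fail depending on the instructions
 * the compiler happens to choose.
 */
__attribute__((target("avx512f")))
static int
upper_bits_are_zero(uint64_t lo, uint64_t hi)
{
    __m128i     x = _mm_set_epi64x((long long) hi, (long long) lo);
    __m512i     z = _mm512_zextsi128_si512(x);
    uint64_t    out[8];

    _mm512_storeu_si512(out, z);

    if (out[0] != lo || out[1] != hi)
        return 0;
    for (int i = 2; i < 8; i++)
    {
        if (out[i] != 0)
            return 0;
    }
    return 1;
}
#endif

/*
 * Returns 1 if the zero-extension check passed, 0 if it failed, and
 * -1 if AVX-512F is not available at build time or at run time.
 */
int
check_zero_extend(void)
{
#ifdef HAVE_X86_INTRINSICS
    if (!__builtin_cpu_supports("avx512f"))
        return -1;
    return upper_bits_are_zero(UINT64_C(0x0123456789abcdef),
                               UINT64_C(0xfedcba9876543210));
#else
    return -1;
#endif
}
```

The runtime check keeps the sketch safe to run on machines without
AVX-512; there it reports -1 instead of executing a 512-bit
instruction.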

--
John Naylor
Amazon Web Services

Attachment: v1-zero-extend-instead-of-cast.patch (text/x-patch, 1.2 KB)
