From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
---|---|
To: | Andy Fan <zhihuifan1213(at)163(dot)com> |
Cc: | "Devulapalli, Raghuveer" <raghuveer(dot)devulapalli(at)intel(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jesper Pedersen <jesperpedersen(dot)db(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com> |
Subject: | Re: Improve CRC32C performance on SSE4.2 |
Date: | 2025-06-17 08:55:06 |
Message-ID: | CANWCAZZK0hVk-N71JsuXPOB0ALy8BBOfRjSA4Nz2Kpt2RCLU0Q@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Jun 17, 2025 at 6:40 AM Andy Fan <zhihuifan1213(at)163(dot)com> wrote:
>
> "Devulapalli, Raghuveer" <raghuveer(dot)devulapalli(at)intel(dot)com> writes:
>
> > Great catch! From the intrinsic manual:
> >
> > Cast vector of type __m128i to type __m512i; the upper 384 bits of the
> > result are undefined.
Thanks Raghuveer and Nathan, for the diagnosis!
> Just be curious, what kind of optimization (like what -O2 does) could
> mask this issue?
In case Andy is asking about "how" rather than "under what
circumstances", my guess is that -O1 and above may simply have chosen
instructions that also happen to zero-extend, which are common. -O0
output doesn't mirror the naive, straightforward structure of what the
programmer wrote; it's more like an "exploded" representation suitable
for later optimization passes. That's why it always looks goofy.
> > Replacing that with _mm512_zextsi128_si512 fixes the problem.
Here's a patch for testing, which also reverts the previous
workaround. Help welcome, but I still promise to test it in the near
future regardless.
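
For anyone following along, here is a minimal sketch of the difference
between the two intrinsics. It is not the attached patch: the names
widen_avx512 and widen_128_to_512 are hypothetical, and it assumes
x86-64 with GCC 10+ or a recent Clang (for _mm512_zextsi128_si512 and
the target attribute). A portable fallback is included so that it also
runs on machines without AVX-512.

```c
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_AVX512_SKETCH 1

/* AVX-512 path (hypothetical helper), compiled for avx512f only here. */
__attribute__((target("avx512f")))
static void
widen_avx512(const uint64_t in[2], uint64_t out[8])
{
    __m128i x = _mm_loadu_si128((const __m128i *) in);

    /*
     * _mm512_castsi128_si512(x) would leave bits 128..511 undefined,
     * so whatever is in the rest of the register may leak through.
     * _mm512_zextsi128_si512(x) defines those bits as zero.
     */
    __m512i z = _mm512_zextsi128_si512(x);

    _mm512_storeu_si512((void *) out, z);
}
#endif

/*
 * Hypothetical wrapper: returns 1 if the AVX-512 path ran, 0 if the
 * portable fallback did. Either way, out[0..1] hold the input and
 * out[2..7] are zero.
 */
int
widen_128_to_512(const uint64_t in[2], uint64_t out[8])
{
#ifdef HAVE_AVX512_SKETCH
    if (__builtin_cpu_supports("avx512f"))
    {
        widen_avx512(in, out);
        return 1;
    }
#endif
    /* Fallback: zero everything, then copy the low 128 bits. */
    memset(out, 0, 8 * sizeof(uint64_t));
    memcpy(out, in, 2 * sizeof(uint64_t));
    return 0;
}
```

Swapping the zext call for _mm512_castsi128_si512 in this sketch may
still appear to work at higher optimization levels, which is consistent
with the guess above, but nothing guarantees it.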
--
John Naylor
Amazon Web Services
Attachment | Content-Type | Size |
---|---|---|
v1-zero-extend-instead-of-cast.patch | text/x-patch | 1.2 KB |