| From: | "Greg Burd" <greg(at)burd(dot)me> |
|---|---|
| To: | "Andres Freund" <andres(at)anarazel(dot)de> |
| Cc: | "Nathan Bossart" <nathandbossart(at)gmail(dot)com>, "John Naylor" <johncnaylorls(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Subject: | Re: Add RISC-V Zbb popcount optimization |
| Date: | 2026-06-01 20:54:27 |
| Message-ID: | d15fe767-e6d1-488e-915a-42794be2cb12@app.fastmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 2026-05-28 10:17:16 -0400, Andres Freund wrote:
> How confident are we that this bug just affects DES?
It's a Clang LoopVectorize wrong-code bug on rv64gcv that we happen to hit in
des_init(); DES is just the spot with riscv build-farm coverage (greenfly).
I reduced it to ~15 lines with no DES semantics: a data-dependent scatter
store dst[idx[i]-1] = i over a directly-visible array, vectorized at
-O2 -march=rv64gcv, comes out wrong. -O1, -fno-vectorize, building without
the V extension, and GCC are all correct:
clang -O2 -march=rv64gcv -> wrong
clang -O1 -march=rv64gcv -> ok
clang -O2 -march=rv64gcv -fno-vectorize -> ok
clang -O2 (no V) -> ok
gcc -O2 -march=rv64gcv -> ok
In the reduced case the scatter is miscompiled into a contiguous copy of the
index array into the destination (vle8.v/vse8.v), dropping both the index and
the stored value; in the full DES code it instead emits an indexed vsoxei8.v.
Either way it's the loop vectorizer, confirmed by -fno-vectorize fixing it.
So any scatter-store loop the vectorizer chooses to handle could be affected,
not just this one. That argues against my per-site barrier approach.
> That seems like a pretty odd fix for the problem. If the problem is
> auto-vectorization, we should stop auto-vectorization, not sprinkle memory
> barriers around.
Agreed. pg_memory_barrier() only works by incidentally blocking the
vectorizer (the memory clobber), which is misleading -- it reads as a
concurrency primitive -- fragile, and per-site. I'll drop it.
> Have you confirmed that, by using a newer clang, the merging of the fixes
> actually fixes the problem?
> ISTM a perfectly viable patch would be to just reject building with a
> non-very-recent clang on riscv.
Confirming empirically now (building trunk clang on the riscv box). Two facts
worth recording first:
- Clang 20.1.2 was tagged 2025-04-02. The LLVM fixes I cited earlier all
merged in Jan 2026, so 20.1.2 contains none of them.
- However, none of those cited issues/PRs is actually this bug. #176105 is a
cost-model change and rv32-only; #176001/#176077 are RISC-V backend
(register coalescer / vmerge COPY-sink) fixes from fuzzer vector_size code;
#171978 and #187458 are unrelated lowering bugs. I can't find any existing
LLVM issue matching an indexed-scatter auto-vectorization correctness
miscompile -- closest but distinct are #80792 and #187402/PR#187802 (LAA
WAW, strided, still open). So a newer clang is not guaranteed to fix it.
- Confirmed: clang 20.1.2 and 21.1.8 both miscompile; clang 22.1.6 is correct
(cross-compiled to riscv64 and run natively on the box). The scatter is
lowered to a unit-stride vse8.v copy (dropping the permutation) on ≤21, and
to a correct vsoxei8.v indexed scatter on 22. So the fix is in clang 22 and
was not backported to 21.1.8.
Given that, the plan:
- Adopt your suggestion: a configure/meson check keyed on __clang_major__ < 22
for riscv. Simplest is to refuse that combination.
- No new LLVM issue needed -- it's already fixed in clang 22. I can still file
one to request a 21.x backport if that's worthwhile.
Either way I'll create a patch that gates the use of the optimization flags
based on the clang version for review ASAP.
The other two patches; Zbb popcount, and Zbc CRC32C, are feature patches that are
valuable but separate. I'll update that thread with those.
best.
-greg
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Joel Jacobson | 2026-06-01 21:33:24 | Re: PostgreSQL 19 Beta 1 release announcement draft |
| Previous Message | Joel Jacobson | 2026-06-01 20:36:36 | Re: Key joins |