| From: | Henson Choi <assam258(at)gmail(dot)com> |
|---|---|
| To: | Matheus Alcantara <matheusssilv97(at)gmail(dot)com> |
| Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: LLVM JIT: any JIT-compiled query crashes (SIGILL) on a libLLVM 19 + ASAN build |
| Date: | 2026-06-12 01:48:32 |
| Message-ID: | CAAAe_zDtwoL8KaC_cpK4rU9jCrLxtGJ=TwLogdfH1PeE3_GT9Q@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi Matheus,
Thanks for digging into this, and for the patch!
> I think that the fix is to filter out sanitizer flags when generating
> bitcode for the JIT code [...]
> With this fix, JIT works correctly under ASAN + LLVM 19 on my machine.
Confirmed here too: with your filter applied the crash is gone and the JIT
runs normally under ASAN. Filtering the sanitizer flags out of the
bitcode is the right fix.
> the sanitizer instrumentation may change struct layouts in the generated
> LLVM IR [...] FIELDNO_EXPRSTATE_PARENT = 11 [...]
One nit: on libLLVM 20.1.8 the bitcode struct layout is identical with and
without -fsanitize=address (e.g. %struct.ExprState, index 11 stays a
pointer), so it isn't a FIELDNO/layout mismatch here. In short, the crash
needs both debug info (-ggdb) and sanitizer instrumentation to land in the
JIT bitcode: the SIGILL is in decodeDiscriminator(), i.e. the instrumented
IR going through the debug-info path. Your fix keeps the debug info but
drops the instrumentation, and that alone stops it -- so the
instrumentation is the trigger. The LLVM 19 assertion is likely the same
cause surfacing differently.
> I'm also wondering if this happens only with LLVM 19 or other versions
> too.
Not LLVM 19 only -- I reproduced the same SIGILL on libLLVM 20.1.8.
v2 series attached, folding in your fix:
0001 Add a "jit" regression test (renamed/minimized from "jit_crash").
jit is off by default now, so this turns it on to push a trivial
query through the JIT provider.
0002 Your meson fix, with an added warning() so a sanitizer build knows
its JIT code won't be instrumented. (Author: Matheus Alcantara.)
0003 Same for autoconf: filter sanitizer flags from BITCODE_CFLAGS/
CXXFLAGS with a configure warning, plus -g under --enable-debug so
the bitcode keeps debug info. The -g part is a judgment call --
autoconf just rebuilds BITCODE_CFLAGS from a whitelist that
doesn't include -g -- so feel free to keep or drop it.
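For readers who want the filtering idea from 0002/0003 in one place, here is a rough sketch in Python, for illustration only. The real patches do this in meson and autoconf; the function name and the assumption that every sanitizer flag starts with "-fsanitize" are mine, not taken from the patches.

```python
# Illustrative sketch only: the actual fix lives in meson.build and
# configure, not in Python. The "-fsanitize" prefix match is an assumption
# about which flags count as sanitizer flags.

def split_sanitizer_flags(cflags, prefix="-fsanitize"):
    """Split compiler flags into (kept, dropped) lists.

    Flags starting with `prefix` are dropped so that bitcode generated
    for the JIT is not instrumented; everything else, including debug
    flags such as -ggdb, is kept in its original order.
    """
    kept, dropped = [], []
    for flag in cflags:
        if flag.startswith(prefix):
            dropped.append(flag)
        else:
            kept.append(flag)
    return kept, dropped


if __name__ == "__main__":
    kept, dropped = split_sanitizer_flags(
        ["-O2", "-ggdb", "-fsanitize=address", "-fno-omit-frame-pointer"]
    )
    if dropped:
        # Mirrors the intent of the build-time warning: tell the user that
        # JIT bitcode will not carry sanitizer instrumentation.
        print("warning: sanitizer flags removed from bitcode flags:",
              " ".join(dropped))
    print("bitcode flags:", " ".join(kept))
```

The point of returning the dropped flags as well is that the caller can emit the warning only when something was actually filtered.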
Tested on both build systems with an ASAN backend: the jit test crashes
before and passes after, JIT stays functional (pg_jit_available() = t,
EXPLAIN ANALYZE shows functions compiled), and the warning fires.
Thanks,
Henson
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-jit-test.patch | application/octet-stream | 2.9 KB |
| v2-0002-meson-strip-sanitizer.patch | application/octet-stream | 1.9 KB |
| v2-0003-configure-strip-sanitizer.patch | application/octet-stream | 3.9 KB |