Re: LLVM JIT: any JIT-compiled query crashes (SIGILL) on a libLLVM 19 + ASAN build

From: Henson Choi <assam258(at)gmail(dot)com>
To: Matheus Alcantara <matheusssilv97(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LLVM JIT: any JIT-compiled query crashes (SIGILL) on a libLLVM 19 + ASAN build
Date: 2026-06-12 01:48:32
Message-ID: CAAAe_zDtwoL8KaC_cpK4rU9jCrLxtGJ=TwLogdfH1PeE3_GT9Q@mail.gmail.com
Lists: pgsql-hackers

Hi Matheus,

Thanks for digging into this, and for the patch!

> I think that the fix is to filter out sanitizer flags when generating
> bitcode for the JIT code [...]
> With this fix, JIT works correctly under ASAN + LLVM 19 on my machine.

Confirmed here too: with your filter applied the crash is gone and the JIT
runs normally under ASAN. Filtering the sanitizer flags out of the
bitcode is the right fix.

> the sanitizer instrumentation may change struct layouts in the generated
> LLVM IR [...] FIELDNO_EXPRSTATE_PARENT = 11 [...]

One nit: on libLLVM 20.1.8 the bitcode struct layout is identical with and
without -fsanitize=address (e.g. in %struct.ExprState, index 11 stays a
pointer), so it isn't a FIELDNO/layout mismatch here. In short, the crash
needs both debug info (-ggdb) and sanitizer instrumentation to land in the
JIT bitcode: the SIGILL is in decodeDiscriminator(), i.e. the instrumented
IR is going through the debug-info path. Your fix keeps the debug info but
drops the instrumentation, and that alone stops the crash -- so the
instrumentation is the trigger. The LLVM 19 assertion is likely the same
cause surfacing differently.

> I'm also wondering if this happens only with LLVM 19 or other versions
> too.

Not LLVM 19 only -- I reproduced the same SIGILL on libLLVM 20.1.8.

v2 series attached, folding in your fix:

0001  Add a "jit" regression test (renamed/minimized from "jit_crash").
      jit is off by default now, so this turns it on to push a trivial
      query through the JIT provider.

0002  Your meson fix, with an added warning() so a sanitizer build knows
      its JIT code won't be instrumented. (Author: Matheus Alcantara.)

0003  Same for autoconf: filter sanitizer flags from
      BITCODE_CFLAGS/CXXFLAGS with a configure warning, plus -g under
      --enable-debug so the bitcode keeps debug info. The -g part is a
      judgment call -- autoconf just rebuilds BITCODE_CFLAGS from a
      whitelist that doesn't include -g -- so feel free to keep or drop it.
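
For anyone skimming the thread, the flag filtering in 0002/0003 amounts to
roughly the following. This is only a Python sketch of the idea, not the
meson or autoconf code in the patches: the function name
strip_sanitizer_flags, the example flag list, and the assumption that
"sanitizer flag" means anything starting with -fsanitize or -fno-sanitize
are all mine.

```python
def strip_sanitizer_flags(cflags):
    """Split compiler flags into (kept, dropped) lists.

    Assumption: a flag counts as a sanitizer flag if it starts with
    -fsanitize or -fno-sanitize. The real patches may match differently.
    """
    kept, dropped = [], []
    for flag in cflags:
        if flag.startswith(("-fsanitize", "-fno-sanitize")):
            dropped.append(flag)
        else:
            kept.append(flag)
    return kept, dropped


# Hypothetical backend flags for an ASAN build with debug info.
backend_cflags = ["-O2", "-ggdb", "-fsanitize=address", "-fno-omit-frame-pointer"]

# Flags that would be used for the JIT bitcode: debug info stays,
# sanitizer instrumentation goes.
bitcode_cflags, dropped = strip_sanitizer_flags(backend_cflags)

if dropped:
    # Mirrors the intent of the build-time warning: tell the user the
    # JIT code will not be instrumented.
    print("warning: JIT bitcode will not be sanitizer-instrumented; dropped:",
          " ".join(dropped))
```

The point of returning the dropped flags separately is that the build can
warn only when something was actually filtered, so a non-sanitizer build
stays quiet.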

Tested on both build systems with an ASAN backend: the jit test crashes
before and passes after, JIT stays functional (pg_jit_available() = t,
EXPLAIN ANALYZE shows functions compiled), and the warning fires.

Thanks,
Henson

Attachment                               Content-Type              Size
v2-0001-jit-test.patch                   application/octet-stream  2.9 KB
v2-0002-meson-strip-sanitizer.patch      application/octet-stream  1.9 KB
v2-0003-configure-strip-sanitizer.patch  application/octet-stream  3.9 KB
