Re: LLVM JIT: any JIT-compiled query crashes (SIGILL) on a libLLVM 19 + ASAN build

From: Henson Choi <assam258(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Tatsuo Ishii <ishii(at)postgresql(dot)org>, jian he <jian(dot)universality(at)gmail(dot)com>
Subject: Re: LLVM JIT: any JIT-compiled query crashes (SIGILL) on a libLLVM 19 + ASAN build
Date: 2026-06-10 02:42:43
Message-ID: CAAAe_zATGTkPtMLyJTAYaX9T+78TFoT0BqbNaGG_71mDuXOZhg@mail.gmail.com
Lists: pgsql-hackers

Hi Andres,

Just to let you know: the CI run for this commitfest entry shows the
same crash independently on master as well, so this may not be an issue
specific to the RPR (Row Pattern Recognition) patch.

The identical crash occurs on a standalone test against master.
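
For anyone checking a local build, a minimal control run might look like
the following. It is only a sketch: it reuses the reproduction query from
my original report (quoted below) and changes nothing except jit = off,
which that report says runs fine.

```sql
-- Control run: same settings as the reproduction, but with JIT disabled.
SET jit = off;
SET jit_above_cost = 0;
SET jit_optimize_above_cost = 0;
SET jit_inline_above_cost = 0;

-- Counts the multiples of 3 in 1..100000, so a healthy run returns 33333.
SELECT count(*)
FROM (SELECT i, i * 2 + 1 AS x
      FROM generate_series(1, 100000) i
      WHERE i % 3 = 0) t;
```

Switching jit back to on in the same session is then the only difference
from the crashing case.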

Thanks,
Henson

On Wed, Jun 10, 2026 at 11:09 AM, Henson Choi <assam258(at)gmail(dot)com> wrote:

> Hi hackers,
>
> While looking into Andres Freund's note that cfbot is failing with crashes
> inside the JIT on the Row Pattern Recognition patch [1], I found that the
> crash is not specific to that patch at all: on the CI's AddressSanitizer
> build with LLVM 19, any query that is pushed through the LLVM JIT code
> generator crashes the backend with SIGILL. It reproduces on plain master
> with a trivial aggregate, so I am reporting it as its own issue, separate
> from that feature.
>
> Minimal reproduction
> --------------------
>
> SET jit = on;
> SET jit_above_cost = 0;
> SET jit_optimize_above_cost = 0;
> SET jit_inline_above_cost = 0;
>
> SELECT count(*)
> FROM (SELECT i, i * 2 + 1 AS x
>       FROM generate_series(1, 100000) i
>       WHERE i % 3 = 0) t;
>
> Result:
>
> server closed the connection unexpectedly
> ...
> LOG: client backend (PID NNNNN) was terminated by signal 4: Illegal instruction
>
> A postmaster (forked backend) is required to reproduce reliably; single-user
> mode does not trip it. With jit = off the same query runs fine.
>
> Environment
> -----------
>
> This is the cfbot Linux task environment:
>
> - Debian Trixie, libLLVM 19.1
> - CFLAGS = -O2 -ggdb -fno-sanitize-recover=all -fsanitize=address
> - LDFLAGS = -fsanitize=address
> - meson: -Dcassert=true -Dinjection_points=true --buildtype=debug
>   -Dllvm=enabled (auto_features=disabled)
>
> I reproduced this in a container that mirrors the CI configuration, and also
> on a from-scratch build of plain upstream master
> (89eafad297a9b01ad77cfc1ab93a433e0af894b0, "Fix tuple deforming with virtual
> generated columns"), which contains no in-flight feature patches.
>
> Backtrace
> ---------
>
> The stack is corrupted at the crash, but with libLLVM debug info the top
> frames resolve consistently to:
>
> Program terminated with signal SIGILL, Illegal instruction.
> #0 getUnsignedFromPrefixEncoding ()
> at llvm/include/llvm/Support/Discriminator.h:34
> #1 decodeDiscriminator ()
> at llvm/lib/IR/DebugInfoMetadata.cpp:283
>
> The crashing rip lands in the middle of a valid instruction
> (decodeDiscriminator+48, the immediate byte of "and $0x1f,%r10d"), i.e. the
> libLLVM code itself is intact and control flow was transferred into it at a
> bad offset. The crash always lands at the same place, for every JIT-compiled
> query, which suggests it is systematic rather than random corruption. It
> surfaces in libLLVM's debug-info (discriminator) handling, and persists with
> JIT inlining and optimization both disabled.
>
> Reproducer patch
> ----------------
>
> The attached patch adds a small "jit_crash" regression test that forces the
> JIT compiler (jit on, all jit_*_above_cost set to 0) using a plain aggregate
> over generate_series(). On a working installation it passes; on the broken
> LLVM 19 + ASAN environment it crashes as above. I have also registered it in
> the commitfest so cfbot exercises it directly.
>
> References
> ----------
>
> [1] https://www.postgresql.org/message-id/p7r5bekdbl2zcazid7agvfo2nfnq5bim2a5jkckqygld32n325%40fctfp6ou6qnb
>
> Thanks,
> Henson Choi
>
