From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Instability of phycodorus in pg_upgrade tests with JIT |
Date: | 2025-10-22 21:00:01 |
Message-ID: | 563ee5af-8ee2-484f-b50a-1c8fbdd16171@gmail.com |
Lists: | pgsql-hackers |
Hello Andres,
17.10.2025 08:21, Fujii Masao wrote:
> On Fri, Oct 17, 2025 at 8:32 AM Michael Paquier<michael(at)paquier(dot)xyz> wrote:
>> On Thu, Oct 16, 2025 at 10:00:00PM +0300, Alexander Lakhin wrote:
>>> I collected all of such failures here:
>>> https://wiki.postgresql.org/wiki/Known_Buildfarm_Test_Failures#check-pg_upgrade_fails_on_LLVM-enabled_animals_due_to_double_free_or_corruption
>>>
>>> Masao-san was going to dig into that:
>>> https://www.postgresql.org/message-id/CAHGQGwFcjccSYX+Ap8meEbCccUei-B4tmYsBFu4wMEixKi90fQ@mail.gmail.com
> I tried that briefly, but unfortunately I still have no idea what caused
> this failure or what triggered the double-free issue shown below…
I've been trying to reproduce the issue locally for several days, with
clang 3.9.0 and 4.0.1 built from source with -DCMAKE_BUILD_TYPE=Debug
-DLLVM_ENABLE_ASSERTIONS=ON, running the buildfarm client (TestUpgrade) on
four different x86_64 systems (Debian and Ubuntu, but not the latest
versions), without a single failure so far.
(I re-created the configuration from petalura/phycodurus: 'jit=1',
'jit_above_cost=0', 'jit_optimize_above_cost=1000'... and also tried
jit_optimize_above_cost=0...)
I also deliberately triggered a double free with a simple program and
confirmed that it is detected and the program is aborted.
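For illustration, a check of that kind might look like the sketch below. This
is not the program actually used; the function name double_free_child_signal
is hypothetical, and the sketch assumes a POSIX system. It frees the same
chunk twice in a forked child and reports how the child died, so the caller
survives the abort.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original message): free the same chunk
 * twice in a forked child and report how the child terminated.
 *
 * Returns the terminating signal number, 0 if the child exited on its own
 * (the allocator did not kill it), or -1 if fork() or waitpid() failed.
 */
int
double_free_child_signal(void)
{
    pid_t pid = fork();
    int   status;

    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        /* volatile discourages the compiler from optimizing the calls away */
        void *volatile p = malloc(64);

        free(p);
        free(p);        /* deliberate bug: second free of the same pointer */
        _exit(0);       /* reached only if the double free went unnoticed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With a glibc that detects the error, the expected return value is SIGABRT;
other allocators may behave differently, so a return value of 0 only means
that this particular libc did not catch the double free.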
So if I re-created all the conditions (based on the buildfarm logs)
correctly, the several hundred runs I performed should have been enough to
reproduce the issue, but there is probably something specific to those
animals (petalura, phycodurus, desmoxytes, dragonet)... Maybe a buggy libc
update was installed there in September?
Meanwhile, we've got a failure at the Check stage (not pg_upgradeCheck), with
a release LLVM build [1]:
2025-10-21 17:15:16.261 CEST [1489783][client backend][:0] LOG: disconnection: session time: 0:00:03.177 user=bf database=regression host=[local]
corrupted size vs. prev_size while consolidating
Thus, the initial suspicion that the issue is caused by dff7591a7 (because
the first failure [2] happened right after it) seems wrong now.
Do you perhaps have any insight into the possible cause of these memory errors?
[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2025-10-21%2015%3A14%3A12
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=phycodurus&dt=2025-09-16%2011%3A09%3A07
Best regards,
Alexander