Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)
Date: 2023-09-11 21:04:59
Message-ID: CA+hUKG+Nz0yR-_SU_uPubBaxQimUPhXy9XYqCUwct3NVjeD4YQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Sep 9, 2023 at 9:00 PM Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
> Yes, I think we deal with something like that. I can try to deduce a minimum
> change that affects reproducing the issue, but maybe it's not that important.
> Perhaps we now should think of escalating the problem to FreeBSD developers?
> I wonder, what kind of reproducer they find acceptable. A standalone C
> program only or maybe a script that compiles/installs postgres and runs
> our test will do too?

We discussed this a bit off-list and I am following up on that. My
guess is that this will turn out to be a bad interaction between that
optimisation and our (former) habit of forking background workers from
inside a signal handler, but let's see...

FTR, if someone is annoyed by this and just wants their build farm
animal not to hang on REL_12_STABLE: from Alexander's later experiments
we learned that setting sysctl kern.sigfastblock_fetch_always=1 fixes
the problem.
