Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)
Date: 2023-02-06 19:20:15
Message-ID: 20230206192015.yhztexkn2eyp3n4q@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2023-02-06 19:51:19 +0100, Tomas Vondra wrote:
> > No. The only thing the machine is doing is
> >
> > while /usr/bin/true; do
> > make check
> > done
> >
> > I can't reduce the workload further, because the "join" test is in a
> > separate parallel group (I cut down parallel_schedule). I could make the
> > machine busier, of course.
> >
> > However, the other lockup I saw was when using serial_schedule, so I
> > guess lower concurrency makes it more likely.
> >
>
> FWIW the machine is now on run ~2700 without any further lockups :-/
>
> Seems it was quite lucky we hit it twice in a handful of attempts.

Did you cut down the workload before you reproduced it the first time, or
after? It's quite possible that it's not reproducible in isolation.

Greetings,

Andres Freund
