Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)
Date: 2023-01-26 20:57:02
Message-ID: CA+hUKGKf9nBEhMOcWUkKNkQp6miCcnM8TYNa-+aq9wpzGXCLJA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 27, 2023 at 9:49 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
> > I received an alert dikkop (my rpi4 buildfarm animal running freebsd 14)
> > did not report any results for a couple days, and it seems it got into
> > an infinite loop in REL_11_STABLE when building hash table in a parallel
> > hashjoin, or something like that.
>
> > It seems to be progressing now, probably because I attached gdb to the
> > workers to get backtraces, which does signals etc.
>
> That reminds me of cases that I saw several times on my now-deceased
> animal florican:
>
> https://www.postgresql.org/message-id/flat/2245838.1645902425%40sss.pgh.pa.us
>
> There's clearly something rotten somewhere in there, but whether
> it's our bug or FreeBSD's isn't clear.

And if it's ours, it's possibly in latch code and not anything higher
(I mean, not in condition variables, barriers, or parallel hash join)
because I saw a similar hang in the shm_mq stuff which uses the latch
API directly. Note that release 13 switched to kqueue but still used
the self-pipe, and release 14 switched to a signal event, and this
hang hasn't been reported in those releases or later, which makes the
poll() code path (the one the older branches use on FreeBSD) a key
suspect.
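
For readers unfamiliar with the self-pipe technique mentioned above,
here is a minimal sketch of the general pattern: a wakeup sets a flag
and writes a byte to a non-blocking pipe, and the waiter drains the
pipe, rechecks the flag, and only then sleeps in poll(). This is not
PostgreSQL's latch code; every sketch_* name and the single-process
layout are assumptions made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical names; this is not the PostgreSQL latch implementation. */
static int sketch_readfd = -1;
static int sketch_writefd = -1;
static volatile sig_atomic_t sketch_is_set = 0;

/* Create the pipe and make both ends non-blocking. Returns 0 or -1. */
static int
sketch_init_selfpipe(void)
{
    int fds[2];

    if (pipe(fds) < 0)
        return -1;
    if (fcntl(fds[0], F_SETFL, O_NONBLOCK) < 0 ||
        fcntl(fds[1], F_SETFL, O_NONBLOCK) < 0)
        return -1;
    sketch_readfd = fds[0];
    sketch_writefd = fds[1];
    return 0;
}

/* Wakeup side: uses only async-signal-safe calls, so a signal handler
 * could call it. Sets the flag first, then makes the pipe readable. */
static void
sketch_set_latch(void)
{
    int     save_errno = errno;
    ssize_t rc;

    sketch_is_set = 1;
    do
        rc = write(sketch_writefd, "", 1);  /* writes one NUL byte */
    while (rc < 0 && errno == EINTR);
    /* EAGAIN means the pipe is already full, so a wakeup is pending. */
    errno = save_errno;
}

static void
sketch_reset_latch(void)
{
    sketch_is_set = 0;
}

/* Wait side: returns 1 if the latch is set, 0 on timeout, -1 on error.
 * Draining before checking the flag closes the race: a wakeup that
 * arrives after the check leaves a byte in the pipe, so poll() returns
 * at once instead of sleeping. The timeout restarts after EINTR. */
static int
sketch_wait_latch(int timeout_ms)
{
    struct pollfd pfd;
    char          buf[64];
    int           rc;

    assert(sketch_readfd >= 0);
    for (;;)
    {
        while (read(sketch_readfd, buf, sizeof(buf)) > 0)
            ;                       /* drain stale wakeup bytes */
        if (sketch_is_set)
            return 1;

        pfd.fd = sketch_readfd;
        pfd.events = POLLIN;
        pfd.revents = 0;
        rc = poll(&pfd, 1, timeout_ms);
        if (rc == 0)
            return 0;               /* timed out with no wakeup */
        if (rc < 0 && errno != EINTR)
            return -1;
        /* readable or interrupted: loop to drain and recheck */
    }
}
```

A caller would run sketch_init_selfpipe() once, then loop on
sketch_wait_latch(), calling sketch_reset_latch() before rechecking
whatever condition it is waiting for. A hang of the kind described
in this thread would look like the waiter sleeping in poll() even
though a wakeup had already been sent.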
