Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)
Date: 2023-01-28 04:18:39
Message-ID: 3620815.1674879519@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> Except that you're saying that you hit this on elver (amd64), I think it'd be
> interesting that we see the failure on an arm host, which has a less strict
> memory order model than x86.

I also saw it on florican, which is/was an i386 machine using clang and
pretty standard build options other than
'CFLAGS' => '-msse2 -O2',
so I think this isn't too much about machine architecture or compiler
flags.

Machine speed might matter though. elver is a good deal faster than
florican was, and dikkop is slower yet. I gather Thomas has seen this
only once on elver, but I saw it maybe a dozen times over a couple of
years on florican, and now dikkop has hit it after not so many runs.

regards, tom lane
