Re: Assertion failure with barriers in parallel hash join

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Assertion failure with barriers in parallel hash join
Date: 2020-10-02 03:07:15
Message-ID: CA+hUKGJuQUK6j2EwJcv5gcLPUCZ=qk0o36VtjL+s-bMV0GURJw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 29, 2020 at 9:12 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> On Tue, Sep 29, 2020 at 7:11 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> > #2 0x00000000009027d2 in ExceptionalCondition
> > (conditionName=conditionName@entry=0xa80846 "!barrier->static_party",
>
> > #4 0x0000000000682ebf in ExecParallelHashJoinNewBatch
>
> Thanks. Ohhh. I think I see how that condition was reached and what
> to do about it, but I'll need to look more closely. I'm away on
> vacation right now, and will update in a couple of days when I'm back
> at a real computer.

Here's a throw-away patch to add some sleeps that trigger the problem,
and a first draft fix. I'll do some more testing of this next week
and see if I can simplify it.

Attachment                                                       Content-Type  Size
0001-Inject-fault-timing.patch                                   text/x-patch  1.4 KB
0002-Fix-race-condition-in-parallel-hash-join-batch-clean.patch  text/x-patch  9.8 KB
