Re: Assertion failure with barriers in parallel hash join

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Assertion failure with barriers in parallel hash join
Date: 2021-03-31 22:25:49
Message-ID: CAAKRu_am6AJEY+Y-tuWYQSK=9kMzknSCkMPiNO_tHyoNZSwGzg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 17, 2021 at 8:18 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Wed, Mar 17, 2021 at 6:58 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > According to BF animal elver there is something wrong with this
> > commit. Looking into it.
>
> Assertion failure reproduced here and understood, but unfortunately
> it'll take some more time to fix this. I've reverted the commit for
> now to unbreak the ~5 machines that are currently showing red in the
> build farm.

So, I think the premise of the patch
v6-0001-Fix-race-condition-in-parallel-hash-join-batch-cl.patch makes
sense: freeing the hashtable memory should happen atomically with
updating the flag that indicates to all others that the memory is not to
be accessed any longer.

As was likely reported by the buildfarm leading to you reverting the
patch, when the inner side is empty and we dump out before advancing the
build barrier past PHJ_BUILD_HASHING_OUTER, we trip the new Asserts
you've added in ExecHashTableDetach().

Assert(!pstate ||
BarrierPhase(&pstate->build_barrier) >= PHJ_BUILD_RUNNING);

Hmm.

Maybe if the inner side is empty, we can advance the build barrier to
the end?
We help batch 0 along like this in ExecParallelHashJoinSetUpBatches().

But, I'm not sure we can expect the process executing this code to be
attached to the build barrier, can we?

@@ -296,7 +304,19 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
* outer relation.
*/
if (hashtable->totalTuples == 0 &&
!HJ_FILL_OUTER(node))
+ {
+ if (parallel)
+ {
+ Barrier *build_barrier;
+
+ build_barrier =
&parallel_state->build_barrier;
+ while
(BarrierPhase(build_barrier) < PHJ_BUILD_DONE)
+
BarrierArriveAndWait(build_barrier, 0);
+ BarrierDetach(build_barrier);
+ }
+
return NULL;
+ }

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2021-03-31 22:27:58 Re: New IndexAM API controlling index vacuum strategies
Previous Message Tomas Vondra 2021-03-31 22:25:19 Re: Crash in BRIN minmax-multi indexes