Re: Query running for very long time (server hanged) with parallel append

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: amitdkhan(dot)pg(at)gmail(dot)com
Cc: robertmhaas(at)gmail(dot)com, thomas(dot)munro(at)enterprisedb(dot)com, rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Query running for very long time (server hanged) with parallel append
Date: 2018-02-06 04:34:19
Message-ID: 20180206.133419.02213593.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Mon, 5 Feb 2018 15:29:27 +0530, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote in <CAJ3gD9eFR8=kxjPYBEHe34uT9+RYET0VbhGEfSt79eZx3L9E1Q(at)mail(dot)gmail(dot)com>
> On 2 February 2018 at 20:46, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > On Fri, Feb 2, 2018 at 1:43 AM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> >> The query is actually hanging because one of the workers is in a small
> >> loop where it iterates over the subplans searching for unfinished
> >> plans, and it never comes out of the loop (it's a bug which I am yet
> >> to fix). And it does not make sense to keep CHECK_FOR_INTERRUPTS in
> >> each iteration; it's a small loop that does not pass control to any
> >> other functions .
> >
> > Uh, sounds like we'd better fix that bug.
>
>
> The scenario is this : One of the workers w1 hasn't yet chosen the
> first plan, and all the plans are already finished. So w1 has it's
> node->as_whichplan equal to -1. So the below condition in
> choose_next_subplan_for_worker() never becomes true for this worker :
>
> if (pstate->pa_next_plan == node->as_whichplan)
> {
> /* We've tried everything! */
> pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
> LWLockRelease(&pstate->pa_lock);
> return false;
> }
>
> What I think is : we should save the information about which plan we
> started the search with, before the loop starts. So, we save the
> starting plan value like this, before the loop starts:
> initial_plan = pstate->pa_next_plan;
>
> And then break out of the loop when we come back to to this initial plan :
> if (initial_plan == pstate->pa_next_plan)
> break;
>
> Now, suppose it so happens that initial_plan is a non-partial plan.
> And then when we wrap around to the first partial plan, we check
> (initial_plan == pstate->pa_next_plan) which will never become true
> because initial_plan is less than first_partial_plan.
>
> So what I have done in the patch is : have a flag 'should_wrap_around'
> indicating that we should not wrap around. This flag is true when
> initial_plan is a non-partial plan, in which case we know that we will
> have covered all plans by the time we reach the last plan. So break
> from the loop if this flag is false, or if we have reached the initial
> plan.
>
> Attached is a patch that fixes this issue on the above lines.

The patch adds two new variables and always sets them but warp
around can happen at most once per call so I think it is enough
to arrange to stop at the wrap around time. Addition to that the
patch is forgetting the case of no partial plan. The loop can
overrun on pa_finished in the case.

I think something like the following would work.

@@ -485,6 +485,13 @@ choose_next_subplan_for_worker(AppendState *node)
{
/* Loop back to first partial plan. */
pstate->pa_next_plan = append->first_partial_plan;
+
+ /*
+ * Arrange to bail out if we are trying to fetch the first partial
+ * plan
+ */
+ if (node->as_whichplan < append->first_partial_plan)
+ node->as_whichplan = append->first_partial_plan;
}
else

> >> But I am not sure about this : while the workers are at it, why the
> >> backend that is waiting for the workers does not come out of the wait
> >> state with a SIGINT. I guess the same issue has been discussed in the
> >> mail thread that you pointed.
> >
> > Is it getting stuck here?
> >
> > /*
> > * We can't finish transaction commit or abort until all of the workers
> > * have exited. This means, in particular, that we can't respond to
> > * interrupts at this stage.
> > */
> > HOLD_INTERRUPTS();
> > WaitForParallelWorkersToExit(pcxt);
> > RESUME_INTERRUPTS();
>
> Actually the backend is getting stuck in
> choose_next_subplan_for_leader(), in LWLockAcquire(), waiting for the
> hanging worker to release the pstate->pa_lock. I think there is
> nothing wrong in this, because it is assumed that LWLock wait is going
> to be for very short tiime, and because of this bug, the lwlock waits
> forever.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2018-02-06 04:41:47 Re: Query running for very long time (server hanged) with parallel append
Previous Message Peter Eisentraut 2018-02-06 04:21:44 Re: [COMMITTERS] pgsql: Rearm statement_timeout after each executed query.