Re: parallel mode and parallel contexts

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel mode and parallel contexts
Date: 2015-03-06 12:01:54
Message-ID: CAA4eK1+eRkLYAW1w5wE+xotNSdM7WgVFoPSMi2kPJ_r4zwjJFA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 15, 2015 at 11:48 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Fri, Feb 13, 2015 at 2:22 AM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
> > On Thu, Feb 12, 2015 at 3:59 AM, Robert Haas <robertmhaas(at)gmail(dot)com>
wrote:
> >>
> >> We're not seeing eye to eye here yet, since I don't accept your
> >> example scenario and you don't accept mine. Let's keep discussing.
> >>
> >> Meanwhile, here's an updated patch.
> >
> > A lot of cool activity is showing up here, so moved the patch to CF
2015-02.
> > Perhaps Andres you could add yourself as a reviewer?
>
> Here's a new version of the patch with (a) some additional
> restrictions on heavyweight locking both at the start of, and during,
> parallel mode and (b) a write-up in the README explaining the
> restrictions and my theory of why the handling of heavyweight locking
> is safe. Hopefully this goes some way towards addressing Andres's
> concerns. I've also replaced the specific (and wrong) messages about
> advisory locks with a more generic message, as previously proposed;
> and I've fixed at least one bug.
>

Today, while testing parallel_seqscan patch, I encountered one
intermittent issue (it hangs in below function) and I have question
related to below function.

+void
+WaitForParallelWorkersToFinish(ParallelContext *pcxt)
+{
..
+ for (;;)
+ {
..
+ CHECK_FOR_INTERRUPTS();
+ for (i = 0; i < pcxt->nworkers; ++i)
+ {
+ if (pcxt->worker[i].error_mqh != NULL)
+ {
+ anyone_alive = true;
+ break;
+ }
+ }
+
+ if (!anyone_alive)
+ break;
+
+ WaitLatch(&MyProc->procLatch, WL_LATCH_SET, -1);
+ ResetLatch(&MyProc->procLatch);
+ }

Isn't there a race condition in this function such that after it finds
that there is some alive worker and before it does WaitLatch(), the
worker completes its work and exits, now in such a case who is
going to wake the backend waiting on procLatch?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2015-03-06 12:44:19 Re: parallel mode and parallel contexts
Previous Message Greg Stark 2015-03-06 11:23:55 Re: MD5 authentication needs help