Re: parallel mode and parallel contexts

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel mode and parallel contexts
Date: 2015-01-21 14:10:04
Message-ID: CAA4eK1JKq3T7PijBoVNhzscgOUNHSbSSFu1GkktPAD=Q0g-DCQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 21, 2015 at 6:35 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Wed, Jan 21, 2015 at 2:11 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > On Tue, Jan 20, 2015 at 9:52 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> On Tue, Jan 20, 2015 at 9:41 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >> > It seems [WaitForBackgroundWorkerShutdown] has possibility to wait
> >> > forever.
> >> > Assume one of the worker is not able to start (not able to attach
> >> > to shared memory or some other reason), then status returned by
> >> > GetBackgroundWorkerPid() will be BGWH_NOT_YET_STARTED
> >> > and after that it can wait forever in WaitLatch.
> >>
> >> I don't think that's right. The status only remains
> >> BGWH_NOT_YET_STARTED until the postmaster forks the child process.
> >
> > I think the control flow can reach the above location before
> > postmaster could fork all the workers. Consider a case that
> > we have launched all workers during ExecutorStart phase
> > and then before postmaster could start all workers an error
> > occurs in master backend and then it try to Abort the transaction
> > and destroy the parallel context, at that moment it will get the
> > above status and wait forever in above code.
> >
> > I am able to reproduce above scenario with debugger by using
> > parallel_seqscan patch.
>
> Hmm. Well, if you can reproduce it, there clearly must be a bug. But
> I'm not quite sure where. What should happen in that case is that the
> process that started the worker has to wait for the postmaster to
> actually start it,

Okay, I think this should solve the issue. It should also be done
before the call to TerminateBackgroundWorker().
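
The hang discussed in the quoted text can be illustrated with a toy
model. This is only a sketch and not PostgreSQL code: ToyWorker,
toy_tick() and toy_wait_for_shutdown() are hypothetical names, the
status values merely mirror BGWH_NOT_YET_STARTED and friends, and a
bounded wakeup count stands in for WaitLatch() so that the "wait
forever" case shows up as a false return instead of a real hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the background worker states named in the thread. */
typedef enum { TOY_NOT_YET_STARTED, TOY_STARTED, TOY_STOPPED } ToyStatus;

typedef struct {
    ToyStatus status;
    int fork_after;  /* wakeups left before the toy postmaster forks; -1 = never */
    int exit_after;  /* wakeups left after the fork before the worker exits */
} ToyWorker;

/* One simulated wakeup: advance the toy postmaster/worker state by a step. */
static void toy_tick(ToyWorker *w)
{
    if (w->status == TOY_NOT_YET_STARTED) {
        if (w->fork_after < 0)
            return;                    /* worker is never started */
        if (w->fork_after == 0)
            w->status = TOY_STARTED;
        else
            w->fork_after--;
    } else if (w->status == TOY_STARTED) {
        if (w->exit_after <= 0)
            w->status = TOY_STOPPED;
        else
            w->exit_after--;
    }
}

/*
 * Poll the status and "sleep" until the worker is seen as stopped.
 * Returns true if the stop was observed within max_wakeups, false if
 * the caller would still be waiting (an unbounded loop would hang).
 */
static bool toy_wait_for_shutdown(ToyWorker *w, int max_wakeups)
{
    assert(w != NULL && max_wakeups >= 0);
    for (int i = 0; i < max_wakeups; i++) {
        if (w->status == TOY_STOPPED)
            return true;
        toy_tick(w);                   /* stands in for WaitLatch() */
    }
    return w->status == TOY_STOPPED;
}
```

With a worker that is forked and later exits, toy_wait_for_shutdown()
returns true; with fork_after = -1 the status never leaves
TOY_NOT_YET_STARTED and the function returns false, which corresponds
to the wait that never ends.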

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
