Re: [HACKERS] parallel.c oblivion of worker-startup failures

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] parallel.c oblivion of worker-startup failures
Date: 2017-12-11 04:07:39
Message-ID: CAA4eK1JYmj0AMvPSfXEr0fNx7JYcF7pgOVko5hKjDEh+QCseug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 8, 2017 at 9:45 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Dec 8, 2017 at 4:56 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> I think the optimization you are suggesting has a merit over what I
>> have done in the patch. However, one point to note is that we are
>> anyway going to call that function down in the path(
>> WaitForParallelWorkersToExit->WaitForBackgroundWorkerShutdown->GetBackgroundWorkerPid),
>> so we might want to take the advantage of calling it first time as
>> well. We can actually cache the status of workers that have returned
>> BGWH_STOPPED and use it later so that we don't need to make this
>> function call again for workers which are already stopped. We can
>> even do what Tom is suggesting to avoid calling it for workers which
>> are known to be launched, but I am really not sure if that function
>> call is costly enough that we need to maintain one extra state to
>> avoid that.
>
> Well, let's do what optimization we can without making it too complicated.
>

Okay, see the attached and let me know if that suffices the need?

>> While looking into this code path, I wonder how much we are gaining by
>> having two separate calls (WaitForParallelWorkersToFinish and
>> WaitForParallelWorkersToExit) to check the status of workers after
>> finishing the parallel execution? They are used together in rescan
>> code path, so apparently there is no benefit calling them separately
>> there. OTOH, during shutdown of nodes, it will often be the case that
>> they will be called in a short duration after each other except for
>> the case where we need to shut down the node for the early exit like
>> when there is a limit clause or such.
>
> I'm not 100% sure either, but if we're going to do anything about
> that, it seems like a topic for another patch.
>

makes sense.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
fix_parallel_worker_startup_failures_v1.patch application/octet-stream 1.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-12-11 04:25:22 Re: BUG #14941: Vacuum crashes
Previous Message Tsunakawa, Takayuki 2017-12-11 03:13:00 Added PostgreSQL internals learning materials in Developer FAQ