Re: parallel.c oblivion of worker-startup failures

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel.c oblivion of worker-startup failures
Date: 2017-10-27 12:54:26
Message-ID: CAA4eK1LEzm_u1UHxzmtvYGmFZMc6e3c560XFawERVj_bS+jFbA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 19, 2017 at 8:47 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Mon, Sep 18, 2017 at 10:00 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
>>> Attached patch fixes these problems.
>>
>> Hmm, this patch adds a kill(notify_pid) after one call to
>> ForgetBackgroundWorker, but the postmaster has several more such calls.
>> Shouldn't they all notify the notify_pid? Should we move that
>> functionality into ForgetBackgroundWorker itself, so we can't forget
>> it again?
>>
>
> Among other places, we already notify during
> ReportBackgroundWorkerExit(). However, it seems to me that none of the
> other places, apart from the one where this patch adds a notify call,
> need such a call. The calls in DetermineSleepTime and
> ResetBackgroundWorkerCrashTimes are made for a crashed worker, which I
> don't think requires notifying the backend, as that backend
> itself would have restarted. The other place where we call
> ForgetBackgroundWorker is in maybe_start_bgworkers when rw_terminate
> is set to true, which again seems to be either the case of a worker
> crash or the case where someone has explicitly asked to terminate the
> worker, in which case we already send a notification.
>
> I think we need to notify the backend on start, stop and failure to
> start a worker. OTOH, if it is harmless to send a notification even
> after the worker has crashed, then we can just move that functionality
> into ForgetBackgroundWorker itself, as that will simplify the code. In
> fact, that is the first thing that occurred to me as well, but I
> haven't done it that way as I was not sure whether we want to send a
> notification in all kinds of cases.
>
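
To make the alternative discussed above concrete, here is a rough,
standalone sketch of what "notify from inside the forget routine" could
look like. It is not the postmaster code and not the attached patch:
SketchWorker, sketch_forget_worker and on_sigusr1 are hypothetical names
invented for illustration, and the real ForgetBackgroundWorker does more
bookkeeping than clearing a flag. The only idea carried over from the
thread is that the backend recorded in bgw_notify_pid gets a signal every
time a worker entry is forgotten, so no caller can forget to send it.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical, simplified stand-in for a registered-worker entry. */
typedef struct SketchWorker
{
    pid_t notify_pid;   /* backend to signal; 0 means "nobody to notify" */
    bool  in_use;       /* is the entry still registered? */
} SketchWorker;

/* Counts signals received; lets a caller observe the notification. */
static volatile sig_atomic_t notifications = 0;

/* Hypothetical handler a waiting backend might install for SIGUSR1. */
static void
on_sigusr1(int signo)
{
    (void) signo;
    notifications++;
}

/*
 * Forget a worker entry and notify the interested backend in one step.
 * Returns true if a signal was sent successfully, false if there was
 * nobody to notify or kill() failed (for example, the backend is gone).
 */
static bool
sketch_forget_worker(SketchWorker *w)
{
    bool notified = false;

    assert(w != NULL);

    /* Notify first, while we still know whom to signal. */
    if (w->notify_pid != 0)
        notified = (kill(w->notify_pid, SIGUSR1) == 0);

    /* Then release the entry. */
    w->in_use = false;
    w->notify_pid = 0;

    return notified;
}
```

With this shape, the crashed-worker callers would also send a signal.
Whether that extra signal is harmless is exactly the open question in
the quoted text; the sketch does not settle it.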

The patch still applies (with some hunks). I have added it to the CF [1]
to avoid losing track of it.

[1] - https://commitfest.postgresql.org/15/1341/

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
