Re: strange parallel query behavior after OOM crashes

From: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Neha Khatri <nehakhatri5(at)gmail(dot)com>, Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: strange parallel query behavior after OOM crashes
Date: 2017-04-11 07:11:47
Message-ID: CAGz5QC+AVEVS+3rBKRq83AxkJLMZ1peMt4nnrQwczxOrmo3CNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 11, 2017 at 2:36 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Apr 10, 2017 at 2:32 PM, Neha Khatri <nehakhatri5(at)gmail(dot)com> wrote:
>> On Tue, Apr 11, 2017 at 1:16 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> 1. Forget BGW_NEVER_RESTART workers in
>>> ResetBackgroundWorkerCrashTimes() rather than leaving them around to
>>> be cleaned up after the conclusion of the restart, so that they go
>>> away before rather than after shared memory is reset.
>>
+1 for the solution.

>> Now with this, would it still be required to forget BGW_NEVER_RESTART
>> workers in maybe_start_bgworker():
>>
>> if (rw->rw_crashed_at != 0)
>> {
>> if (rw->rw_worker.bgw_restart_time == BGW_NEVER_RESTART)
>> {
>> ForgetBackgroundWorker(&iter);
>> continue;
>> }
>> ......
>> }
>
> Well, as noted before, not every background worker failure leads to a
> crash-and-restart; for example, a non-shmem-connected worker or one
> that exits with status 1 will set rw_crashed_at but will not cause a
> crash-and-restart cycle. It's not 100% clear to me whether the code
> you're talking about can be reached in those cases, but I wouldn't bet
> against it.
I think that for above-mentioned background workers, we follow a
different path to call ForgetBackgroundWorker().
CleanupBackgroundWorker()
- ReportBackgroundWorkerExit()
- ForgetBackgroundWorker()
But, I'm not sure for any other cases.

Should we include the assert statement as well?

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2017-04-11 07:17:29 Re: Range Merge Join v1
Previous Message Fabien COELHO 2017-04-11 07:07:44 Re: Variable substitution in psql backtick expansion