Re: strange parallel query behavior after OOM crashes

From: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: strange parallel query behavior after OOM crashes
Date: 2017-04-05 07:05:21
Message-ID: CAGz5QCLvKrTqfYyMQCr4KBvYBVtrDTcKjxpxE-FmJqZf3Qgp2w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 4, 2017 at 11:22 PM, Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> On 04/04/2017 06:52 PM, Robert Haas wrote:
>>
>> On Mon, Apr 3, 2017 at 6:08 AM, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
>> wrote:
>>>
>>> On Fri, Mar 31, 2017 at 6:50 PM, Robert Haas <robertmhaas(at)gmail(dot)com>
>>> wrote:
>>>>
>>>> On Thu, Mar 30, 2017 at 4:35 PM, Kuntal Ghosh
>>>> <kuntalghosh(dot)2007(at)gmail(dot)com> wrote:
>>>>>
>>>>> 2. the server restarts automatically, initialize
>>>>> BackgroundWorkerData->parallel_register_count and
>>>>> BackgroundWorkerData->parallel_terminate_count in the shared memory.
>>>>> After that, it calls ForgetBackgroundWorker and it increments
>>>>> parallel_terminate_count.
>>>>
>>>>
>>>> Hmm. So this seems like the root of the problem. Presumably those
>>>> things need to be reset AFTER forgetting any background workers from
>>>> before the crash.
>>>>
>>> IMHO, the fix would be not to increase the terminated parallel worker
>>> count whenever ForgetBackgroundWorker is called due to a bgworker
>>> crash. I've attached a patch for the same. PFA.
>>
>>
>> While I'm not opposed to that approach, I don't think this is a good
>> way to implement it. If you want to pass an explicit flag to
>> ForgetBackgroundWorker telling it whether or not it should performing
>> the increment, fine. But with what you've got here, you're
>> essentially relying on "spooky action at a distance". It would be
>> easy for future code changes to break this, not realizing that
>> somebody's got a hard dependency on 0 having a specific meaning.
>>
>
> I'm probably missing something, but I don't quite understand how these
> values actually survive the crash. I mean, what I observed is OOM followed
> by a restart, so shouldn't BackgroundWorkerShmemInit() simply reset the
> values back to 0? Or do we call ForgetBackgroundWorker() after the crash for
> some reason?
AFAICU, during crash recovery, we wait for all non-syslogger children
to exit, then reset shmem(call BackgroundWorkerShmemInit) and perform
StartupDataBase. While starting the startup process we check if any
bgworker is scheduled for a restart. If a bgworker is crashed and not
meant for a restart(parallel worker in our case), we call
ForgetBackgroundWorker() to discard it.

>
> In any case, the comment right before BackgroundWorkerArray says this:
>
> * These counters can of course overflow, but it's not important here
> * since the subtraction will still give the right number.
>
> which means that this assert
>
> + Assert(BackgroundWorkerData->parallel_register_count >=
> + BackgroundWorkerData->parallel_terminate_count);
>
> is outright broken, just like any other attempts to rely on simple
> comparisons of these two counters, no?
>
IIUC, the assert statements ensures that register count should always
be greater than or equal to the terminate count.
Whereas, the comment refers to the fact that register count and
terminate count indicate the total number of parallel workers spawned
and terminated respectively since the server has been started. At
every session, we're not resetting the counts, hence they can
overflow. But, their substraction should give you the number of active
parallel worker at any instance.
So, I'm not able to see how the assert is broken w.r.t the aforesaid
comment. Am I missing something here?

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-04-05 07:12:17 Re: Statement timeout behavior in extended queries
Previous Message Tsunakawa, Takayuki 2017-04-05 07:03:02 Re: Statement timeout behavior in extended queries