Re: crashes due to setting max_parallel_workers=0

From: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: crashes due to setting max_parallel_workers=0
Date: 2017-03-27 05:29:21
Message-ID: CAGPqQf2aGRD=Mx_8R7ZE7DV-nUE0gGoUX9qWJKxNZBmRR59n0w@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 27, 2017 at 3:43 AM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> On 03/25/2017 05:18 PM, Rushabh Lathia wrote:
>
>>
>>
>> On Sat, Mar 25, 2017 at 7:01 PM, Peter Eisentraut
>> <peter(dot)eisentraut(at)2ndquadrant(dot)com
>> <mailto:peter(dot)eisentraut(at)2ndquadrant(dot)com>> wrote:
>>
>> On 3/25/17 09:01, David Rowley wrote:
>> > On 25 March 2017 at 23:09, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com
>> <mailto:rushabh(dot)lathia(at)gmail(dot)com>> wrote:
>> >> Also another point which I think we should fix is, when someone sets
>> >> max_parallel_workers = 0, we should also set
>> >> max_parallel_workers_per_gather
>> >> to zero. That way we can avoid generating the gather path with
>> >> max_parallel_workers = 0.
>> > I see that it was actually quite useful that it works the way it does.
>> > If it had worked the same as max_parallel_workers_per_gather, then
>> > likely Tomas would never have found this bug.
>>
>> Another problem is that the GUC system doesn't really support cases
>> where the validity of one setting depends on the current value of
>> another setting. So each individual setting needs to be robust against
>> cases of related settings being nonsensical.
>>
>>
>> Okay.
>>
>> About the original issue reported by Tomas, I did more debugging and
>> found that the problem was that gather_merge_clear_slots() was not
>> returning the cleared slot when nreaders is zero (meaning
>> nworkers_launched = 0). Because of that, the scan continued even after
>> all the tuples were exhausted, and ended in a server crash at
>> gather_merge_getnext(). In the patch I also added an Assert to
>> gather_merge_getnext() that the index should be less than
>> nreaders + 1 (leader).
>>
>> PFA simple patch to fix the problem.
>>
>>
> I think there are two issues at play, here - the first one is that we
> still produce parallel plans even with max_parallel_workers=0, and the
> second one is the crash in GatherMerge when nworkers=0.
>
> Your patch fixes the latter (thanks for looking into it), which is
> obviously a good thing - getting 0 workers on a busy system is quite
> possible, because all the parallel workers can be already chewing on some
> other query.
>
>
Thanks.

> But it seems a bit futile to produce the parallel plan in the first place,
> because with max_parallel_workers=0 we can't possibly get any parallel
> workers ever. I wonder why compute_parallel_worker() only looks at
> max_parallel_workers_per_gather, i.e. why shouldn't it do:
>
> parallel_workers = Min(parallel_workers, max_parallel_workers);
>
>
I agree with you here. Producing a parallel plan when
max_parallel_workers = 0 is wrong. But rather than your suggested fix, I
think that we should do something like:

/*
 * In no case use more than max_parallel_workers_per_gather or
 * max_parallel_workers.
 */
parallel_workers = Min(parallel_workers, Min(max_parallel_workers,
                       max_parallel_workers_per_gather));
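
To make the effect of the nested Min() concrete, here is a minimal
standalone sketch. It is only an illustration: the helper name
clamp_parallel_workers() is hypothetical, the two GUCs are passed in as
plain int arguments instead of being read from the real globals, and the
Min() macro is redefined locally so the snippet compiles on its own. In
the tree, the clamp would sit inside compute_parallel_worker().

```c
/* Local stand-in for the Min() macro, so this compiles standalone. */
#define Min(x, y) ((x) < (y) ? (x) : (y))

/*
 * Hypothetical helper (not an existing PostgreSQL function): cap the
 * planner's worker estimate by both limits.  The two limits are taken
 * as arguments here purely for illustration.
 */
static int
clamp_parallel_workers(int parallel_workers,
                       int max_parallel_workers,
                       int max_parallel_workers_per_gather)
{
    /* In no case use more than either limit. */
    return Min(parallel_workers,
               Min(max_parallel_workers,
                   max_parallel_workers_per_gather));
}
```

With max_parallel_workers = 0 the result is 0 regardless of the
per-gather limit, so a caller that skips parallel paths for zero workers
would not build a Gather or Gather Merge path in that case.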

> Perhaps this was discussed and is actually intentional, though.
>
>
Yes, I am not quite sure about this.

> Regarding handling this at the GUC level - I agree with Peter that that's
> not a good idea. I suppose we could deal with checking the values in the
> GUC check/assign hooks, but what we don't have is a way to undo the changes
> in all the GUCs. That is, if I do
>
> SET max_parallel_workers = 0;
> SET max_parallel_workers = 16;
>
> I expect to end up with just max_parallel_workers GUC changed and nothing
> else.
>
> regards
>
> --
> Tomas Vondra http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

--
Rushabh Lathia
