Re: Choosing parallel_degree

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, James Sewell <james(dot)sewell(at)lisasoft(dot)com>
Subject: Re: Choosing parallel_degree
Date: 2016-03-16 02:05:34
Message-ID: CA+TgmoZugf745untT_f=bk+RJgK9WEiiN85p7KvfVPtfgKjENQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 15, 2016 at 8:45 PM, David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> On 16 March 2016 at 13:26, Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com> wrote:
>> On 15/03/2016 21:12, Robert Haas wrote:
>>> I thought about this a bit more. There are a couple of easy things we
>>> could do here.
>>>
>>> The 1000-page threshold could be made into a GUC.
>>>
>>> We could add a per-table reloption for parallel-degree that would
>>> override the calculation.
>>>
>>> Neither of those things is very smart, but they'd probably both help
>>> some people. If someone is able to produce a patch for either or both
>>> of these things *quickly*, we could possibly try to squeeze it into
>>> 9.6 as a cleanup of work already done.
>>>
>>
>> I'm not too familiar with parallel planning, but I tried to implement
>> both in the attached patch. I didn't put much effort into the
>> parallel_threshold GUC documentation, because I didn't really see a good
>> way to explain it. I'd be happy to improve it if needed. Also, to make
>> this parameter easier to tune for users, perhaps we could divide the
>> default value by 3 and use it as is in the first iteration in
>> create_parallel_path()?
>>
>> Also, the global max_parallel_degree still needs to be at least 1 for the
>> per-table value to be considered.
>
> Thanks for working on this. I've only skimmed the patch so far, but
> will try to look more closely later.
>
> This did get me wondering why we have the parallel_threshold at all,
> rather than just letting parallel_setup_cost make parallel plans look
> less favourable for smaller relations. I assume that this is so that
> we don't burden the planner with the overhead of generating parallel
> paths for smaller relations?

Right. And we also need some heuristic for judging how many workers to
deploy; parallel_setup_cost is of no use in making that decision.
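
For illustration only, here is a rough standalone sketch of the kind of
size-based heuristic under discussion, with the proposed per-table
reloption simply overriding the calculation. The GUC name
parallel_threshold, the tripling rule, and compute_parallel_degree() are
assumptions drawn from this thread, not committed code:

#include <stdio.h>

/* Proposed GUC from the patch discussed upthread (name assumed). */
static int parallel_threshold = 1000;   /* heap pages */
/* Existing GUC capping the number of workers. */
static int max_parallel_degree = 4;

/*
 * Sketch of a size-based degree heuristic: below the threshold no
 * parallel path is generated at all; above it, one extra worker is
 * added each time the relation triples in size.  A per-table
 * reloption, if set, simply overrides the calculation.
 */
static int
compute_parallel_degree(double heap_pages, int reloption_degree)
{
    int     parallel_degree = 1;
    double  pages_needed = parallel_threshold;

    if (reloption_degree > 0)
        return reloption_degree < max_parallel_degree
            ? reloption_degree : max_parallel_degree;

    if (heap_pages < pages_needed)
        return 0;

    while (heap_pages >= pages_needed * 3)
    {
        parallel_degree++;
        pages_needed *= 3;
    }

    return parallel_degree < max_parallel_degree
        ? parallel_degree : max_parallel_degree;
}

int
main(void)
{
    printf("%d\n", compute_parallel_degree(500.0, 0));   /* 0: below threshold */
    printf("%d\n", compute_parallel_degree(10000.0, 0)); /* 3: 1000 * 3^2 <= 10000 */
    printf("%d\n", compute_parallel_degree(500.0, 2));   /* 2: reloption override */
    return 0;
}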

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
