Re: Parallel Append implementation

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Append implementation
Date: 2017-02-26 17:26:25
Message-ID: CA+Tgmoa1wweR9OSMYQm=H47BNG32c_cX6tXTShRZK8VZk=a7mw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Feb 20, 2017 at 10:54 AM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> On Sun, Feb 19, 2017 at 2:33 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Fri, Feb 17, 2017 at 11:44 AM, Ashutosh Bapat
>> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>>> That's true for a partitioned table, but not necessarily for every
>>> append relation. Amit's patch is generic for all append relations. If
>>> the child plans are joins or subquery segments of set operations, I
>>> doubt if the same logic works. It may be better if we throw as many
>>> workers (or some function "summing" those up) as specified by those
>>> subplans. I guess, we have to use different logic for append relations
>>> which are base relations and append relations which are not base
>>> relations.
>>
>> Well, I for one do not believe that if somebody writes a UNION ALL
>> with 100 branches, they should get 100 (or 99) workers. Generally
>> speaking, the sweet spot for parallel workers on queries we've tested
>> so far has been between 1 and 4. It's straining credulity to believe
>> that the number that's correct for parallel append is more than an
>> order of magnitude larger. Since increasing resource commitment by
>> the logarithm of the problem size has worked reasonably well for table
>> scans, I believe we should pursue a similar approach here.
>
> Thanks for that explanation. It makes sense. So, something like this
> would work: total number of workers = some function of log(sum of
> sizes of relations). The number of workers allotted to each segment
> is restricted to the number of workers chosen by the planner
> while planning that segment. The patch takes care of the limit right
> now. It needs to incorporate the calculation for total number of
> workers for append.

log(sum of sizes of relations) isn't well-defined for a UNION ALL query.
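
For reference, the kind of logarithmic scaling discussed above for table
scans can be sketched roughly as follows. This is only an illustration
under stated assumptions, not code from the patch or from the planner:
the names workers_for_pages and cap_subplan_workers, and the choice of
one extra worker per tripling of size, are hypothetical. It also assumes
a single size in pages is available, which is exactly what is unclear
for a UNION ALL whose branches are joins or subqueries.

```c
#include <limits.h>

/*
 * Hypothetical sketch: grant one worker once the relation reaches
 * `threshold` pages, and one more each time the size triples, so the
 * worker count grows with the logarithm of the size.
 */
static int
workers_for_pages(long pages, long threshold, int max_workers)
{
    int workers = 0;

    if (threshold <= 0)
        return 0;               /* no meaningful threshold, no workers */

    while (pages >= threshold && workers < max_workers)
    {
        workers++;
        if (threshold > LONG_MAX / 3)
            break;              /* avoid overflow on the next tripling */
        threshold *= 3;
    }
    return workers;
}

/*
 * Hypothetical sketch of the per-segment limit: a subplan never gets
 * more workers than the planner chose when planning that subplan.
 */
static int
cap_subplan_workers(int append_workers, int subplan_workers)
{
    return (append_workers < subplan_workers) ? append_workers
                                              : subplan_workers;
}
```

With a threshold of 1024 pages, a 10000-page input gets 3 workers
(thresholds 1024, 3072, and 9216 are all reached; 27648 is not), which
stays in the 1 to 4 range mentioned upthread.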

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
