Re: Parallel Append implementation

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Append implementation
Date: 2017-02-20 05:24:16
Message-ID: CAFjFpReJSxzeCyef87+L8t+d2y-oFP6G18JQa_b0Z-iw4yzgBA@mail.gmail.com
Lists: pgsql-hackers

On Sun, Feb 19, 2017 at 2:33 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Feb 17, 2017 at 11:44 AM, Ashutosh Bapat
> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>> That's true for a partitioned table, but not necessarily for every
>> append relation. Amit's patch is generic for all append relations. If
>> the child plans are joins or subquery segments of set operations, I
>> doubt if the same logic works. It may be better if we throw as many
>> workers (or some function "summing" those up) as specified by those
>> subplans. I guess, we have to use different logic for append relations
>> which are base relations and append relations which are not base
>> relations.
>
> Well, I for one do not believe that if somebody writes a UNION ALL
> with 100 branches, they should get 100 (or 99) workers. Generally
> speaking, the sweet spot for parallel workers on queries we've tested
> so far has been between 1 and 4. It's straining credulity to believe
> that the number that's correct for parallel append is more than an
> order of magnitude larger. Since increasing resource commitment by
> the logarithm of the problem size has worked reasonably well for table
> scans, I believe we should pursue a similar approach here.

Thanks for that explanation. It makes sense. So something like this
would work: total number of workers = some function of log(sum of
sizes of relations). The number of workers allotted to each segment
is restricted to the number of workers chosen by the planner
while planning that segment. The patch already takes care of that
limit. It still needs to incorporate the calculation of the total
number of workers for the append.
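
To make the proposal concrete, here is a rough sketch of the arithmetic, not code from the patch. It assumes sizes are measured in pages and that one more worker is added each time the total size triples past a threshold; the function names, the threshold of 1024 pages, and the cap of 8 workers are all hypothetical choices for illustration.

```python
def total_append_workers(subplan_pages, threshold_pages=1024, max_workers=8):
    """Hypothetical: total workers grow with log base 3 of the summed size.

    subplan_pages   -- estimated size (in pages) of each append child
    threshold_pages -- assumed minimum total size before any worker is used
    max_workers     -- assumed upper bound on workers for the whole append
    """
    total = sum(subplan_pages)
    if total < threshold_pages:
        return 0  # too small to be worth parallelizing
    workers = 1
    limit = threshold_pages
    # Add one worker each time the total size triples past the limit.
    while total >= limit * 3 and workers < max_workers:
        workers += 1
        limit *= 3
    return workers


def cap_per_subplan(total_workers, subplan_workers):
    """Limit each child to the worker count the planner chose for it."""
    return [min(total_workers, w) for w in subplan_workers]


# A UNION ALL with 100 branches of 1024 pages each gets 5 workers, not 100.
print(total_append_workers([1024] * 100))
# A child planned with 2 workers never receives more than 2.
print(cap_per_subplan(5, [2, 8, 0]))
```

With these assumed constants, the worker count rises by one per tripling of total size, so 100 equal-sized branches need only a handful of workers, while each child stays within its own planned limit.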

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
