Re: Parallel Append implementation

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Append implementation
Date: 2017-04-04 05:28:32
Message-ID: CAJ3gD9crnBW=apd7n=RynX08EzrLSnyzgfAordEuHHufDfTKhA@mail.gmail.com
Lists: pgsql-hackers

Thanks, Andres, for your review comments. I will get back to you on the
other comments, but meanwhile I have some queries about the particular
comment below ...

On 4 April 2017 at 10:17, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2017-04-03 22:13:18 -0400, Robert Haas wrote:
>> On Mon, Apr 3, 2017 at 4:17 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> > Hm. I'm not really convinced by the logic here. Wouldn't it be better
>> > to try to compute the minimum total cost across all workers for
>> > 1..#max_workers for the plans in an iterative manner? I.e. try to map
>> > each of the subplans to 1 (if non-partial) or N workers (partial) using
>> > some fitting algorithm (e.g. always choosing the worker(s) that currently
>> > have the least work assigned). I think the current algorithm doesn't
>> > lead to useful #workers for e.g. cases with a lot of non-partial,
>> > high-startup plans - imo a quite reasonable scenario.

I think I might not have understood this part exactly. Are you saying
we need to consider per-subplan parallel_workers to calculate the total
number of workers for the Append ? I also didn't follow the point about
non-partial subplans. Can you please explain how many workers you think
should be expected with, say, 7 subplans out of which 3 are non-partial ?
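
Just to check my understanding, below is a rough sketch (plain
standalone C, not actual planner code; the structure names, the
fit_workers/simulate_cost helpers and the cost model are made up purely
for illustration) of the greedy "least-loaded worker" fitting I think
you are describing:

#include <stdio.h>
#include <stdbool.h>

#define MAX_SIM_WORKERS 64      /* arbitrary cap, just for the sketch */

typedef struct SubplanCost
{
    double  total_cost;         /* estimated cost of the whole subplan */
    bool    is_partial;         /* can several workers share this subplan? */
} SubplanCost;

/* Return the index of the worker with the least work assigned so far. */
static int
least_loaded(const double *load, int nworkers)
{
    int     i,
            best = 0;

    for (i = 1; i < nworkers; i++)
        if (load[i] < load[best])
            best = i;
    return best;
}

/*
 * Simulate running the subplans on 'nworkers' workers; the resulting
 * cost is taken to be the largest per-worker load, i.e. the time until
 * the slowest worker finishes.
 */
static double
simulate_cost(const SubplanCost *plans, int nplans, int nworkers)
{
    double  load[MAX_SIM_WORKERS] = {0};
    double  max_load = 0;
    int     i,
            w;

    for (i = 0; i < nplans; i++)
    {
        if (plans[i].is_partial)
        {
            /* Partial subplan: assume all workers share it evenly. */
            for (w = 0; w < nworkers; w++)
                load[w] += plans[i].total_cost / nworkers;
        }
        else
        {
            /* Non-partial subplan: goes entirely to the least-loaded worker. */
            load[least_loaded(load, nworkers)] += plans[i].total_cost;
        }
    }

    for (w = 0; w < nworkers; w++)
        if (load[w] > max_load)
            max_load = load[w];
    return max_load;
}

/* Try 1..max_workers iteratively; keep the count with the lowest cost. */
static int
fit_workers(const SubplanCost *plans, int nplans, int max_workers)
{
    int     w,
            best_workers = 1;
    double  best_cost = simulate_cost(plans, nplans, 1);

    for (w = 2; w <= max_workers; w++)
    {
        double  cost = simulate_cost(plans, nplans, w);

        if (cost < best_cost)
        {
            best_cost = cost;
            best_workers = w;
        }
    }
    return best_workers;
}

int
main(void)
{
    /* 7 subplans, 3 of them non-partial; the costs are made up. */
    SubplanCost plans[] = {
        {100, false}, {80, false}, {60, false},
        {40, true}, {40, true}, {20, true}, {20, true}
    };

    printf("suggested workers: %d\n", fit_workers(plans, 7, 8));
    return 0;
}

One thing I notice with this naive model is that adding workers never
makes the estimated cost worse, which seems to run into exactly the
problem Robert mentions below about a larger number of workers always
looking better under the current costing.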

>>
>> Well, that'd be totally unlike what we do in any other case. We only
>> generate a Parallel Seq Scan plan for a given table with one # of
>> workers, and we cost it based on that. We have no way to re-cost it
>> if we changed our mind later about how many workers to use.
>> Eventually, we should probably have something like what you're
>> describing here, but in general, not just for this specific case. One
>> problem, of course, is to avoid having a larger number of workers
>> always look better than a smaller number, which with the current
>> costing model would probably happen a lot.
>
> I don't think the parallel seqscan is comparable in complexity with the
> parallel append case. Each worker there does the same kind of work, and
> if one of them is behind, it'll just do less. But correct sizing will
> be more important with parallel-append, because with non-partial
> subplans the work is absolutely *not* uniform.
>
> Greetings,
>
> Andres Freund

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company
